ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


THE VARIATIONAL THEORY IN THE LARGE INCLUDING THE 
NON-REGULAR CASE—FIRST PAPER 


By George Ewing and Marston Morse
(Received July 8, 1942) 


Introduction 


The classical minimum theory was first developed in the so-called regular 
case in which the Euler equations are non-singular. In obtaining the proper- 
ties of the minimizing curve use was made of elementary extremals. Carathé- 
odory studied a non-regular problem in 1906, but Tonelli was the first to make 
an extensive study of non-regular problems. This extension was real inasmuch 
as many important integrals fail to come under the regular case. Notable 
among such examples is the Jacobi least action integral I in the restricted 
problem of three bodies. A desire to understand I was one of the forces moti- 
vating this study. For it seems possible that an analysis of the contour mani- 
folds of I in the Fréchet space of the admissible curves may reveal a hitherto 
unknown topological basis for the planetary orbits. 

The theory in the large has a topological form independent of its application 
to functions of n variables, to simple or double integrals, or to other problems.
See Morse [1]. However, up to the present the application to the ordinary 
simple integral has presupposed the condition of positive regularity and has 
made definite use of broken extremals. It has been an open question whether 
the regularity condition could be relaxed in the theory in the large as it was 
by Tonelli in the minimum theory. The basic difficulties have now been met 
and the new theory includes the old. 

A first change was in the theory of functions on a compact metric space. Two 
auxiliary metrics, a J-metric and an L-metric, had to be introduced to describe 
and establish the upper-reducibility and bounded compactness of J. Recall 
that a function J(g) is “boundedly compact” if for each constant c the subset of
points g, for which $J(g) \le c$, is compact. This change in the general theory
has been made by Morse [2].

The case of the simple integral without the regularity hypothesis is treated 
by Ewing and Morse in two papers of which this is the first. The second of 
these papers contains the major part of the new results. This first paper gives 
the conditions on the integrand f sufficient to insure the bounded compactness 
of J and the equivalence of convergence in length and J-length. The conditions 
of convexity and positive semi-normality which are used have received extensive 
attention. Besides Tonelli, a number of others including Hahn, Menger, 
Graves and McShane have made major contributions. We are particularly 
indebted to McShane as will be seen by our references. In this first paper we 
are thus concerned largely with exposition. It was necessary to bring the 
relevant material together in one place and put it in the form best suited to our 
purposes. 



New proofs have been introduced, particularly in deriving the properties of the
“figurative” from its convexity and the homogeneity of f, without using the
differentiability of f. We also introduce the “pseudo limiting” curve and em-
ploy the concept of “regular convexity.” A generalized Lindeberg theorem is
stated with a reduction of hypotheses and a simplification of McShane's proof.
The difficult problem of conditions sufficient for upper reducibility, and the
proof of the generalized Euler theorem (the homotopy theorem) are left for the
second paper, as well as the integration of the local hypotheses with the theory
in the large.

1. The conditions on f(x, r)

The symbols x and r will designate sets $[x^1, \dots, x^n]$ and $[r^1, \dots, r^n]$
respectively. We shall refer to x and r as vectors and shall use the notation of
vector analysis. Regarded as a point, x shall be restricted to a bounded closed
connected set A in an n-dimensional cartesian space. Let R denote the cartesian
space of points r. The integrand in our variational theory will be defined with
the aid of a function f(x, r) that is finite and single-valued for x on A and r
on R. We shall condition f as follows:

I. The function f shall be homogeneous in r in the sense that

(1.1)    $f(x, kr) = k f(x, r)$,  $k \ge 0$.

II. The function f shall be convex in r for each x.

A first consequence of II is that f(x, r) is continuous in r. See Bonnesen und
Fenchel [1], p. 19.

When f is convex in r at x = c, corresponding to c and to an arbitrary vector $r_0$
there exists at least one vector $a(c, r_0)$ such that for arbitrary r and fixed $r_0$

(1.2)    $f(c, r) \ge a \cdot (r - r_0) + f(c, r_0)$

where $a \cdot (r - r_0)$ is the scalar product of a by $(r - r_0)$.

We shall show that

(1.3)    $a \cdot r_0 = f(c, r_0)$.

Upon setting $r = 2r_0$ and $r = 0$ respectively in (1.2), and using (1.1), one finds that

    $f(c, r_0) - a \cdot r_0 \ge 0 \ge f(c, r_0) - a \cdot r_0.$

Relation (1.3) follows.

We shall say that f is positive semi-normal at x = c (written P.S.N.) if

    $f(c, r) > b \cdot r$  $(r \ne 0)$

for a vector constant b and any $r \ne 0$.

We begin with certain lemmas in which c is fixed.

Lemma 1.1. If $f(c, r) \ge 0$ for all r, the subset in the space R on which f(c, r) = 0
is convex.


Let r′ and r″ be two points r at which f(c, r) = 0. On the line segment L
joining r′ to r″, f(c, r) is convex in r, and hence $f(c, r) \le 0$. But f(c, r) is never
negative by hypothesis, so that f(c, r) = 0 on L, and the proof is complete.
We shall make use of the condition 


(1.4)    $f(c, r) + f(c, -r) > 0$  $(r \ne 0)$

with c fixed and r an arbitrary non-null vector.

Lemma 1.2. If f is non-negative in r at x = c, and if (1.4) holds, there exists
an (n − 1)-plane $\Pi_{n-1}$ of the form A·r = 0 such that f(c, r) > 0, where $A \cdot r \ge 0$
and $r \ne 0$.

When n = 2 the lemma follows readily. For f(c, r) cannot vanish at dia- 
metrically opposite points r and —r by virtue of the condition (1.4); and in 
accordance with Lemma 1.1 must vanish, if at all, in a convex domain bounded 
by two rays through the origin. The 1-plane $\Pi_1$ of the lemma is thus a suitably
chosen straight line. To establish the lemma in general we assume that the 
lemma is true when n = m — 1, and show that it is true when n = m. 

In the m-space of points r let $\Pi_{m-1}$ be an arbitrary (m − 1)-plane through
the origin. The function f(c, r) will be a convex function φ of rectangular
coordinates on $\Pi_{m-1}$, and the condition corresponding to (1.4) holds for φ,
implying that φ does not vanish at diametrically opposite points of $\Pi_{m-1}$. By
virtue of our inductive hypothesis there exists an (m − 2)-plane $\Pi_{m-2}$ through
the origin, with $\Pi_{m-2} \subset \Pi_{m-1}$, such that f(c, r) > 0 on $\Pi_{m-1}$ on the closure of
one side of $\Pi_{m-2}$, r = 0 excepted.

Let $\mu_{m-1}$ be an arbitrary half (m − 1)-plane with $\Pi_{m-2}$ as a boundary. On no
two diametrically opposite half planes of this type can f(c, r) = 0 on both half
planes. This is true for diametrically opposite points by virtue of (1.4), and
for other pairs of points on the two half planes by virtue of Lemma 1.1, bearing
in mind that f(c, r) > 0 on $\Pi_{m-2}$, $r \ne 0$. There accordingly exists at least one
of the half planes $\mu_{m-1}$, say $\nu_{m-1}$, on the closure of which f(c, r) > 0, $r \ne 0$.
Let θ be the angle which $\mu_{m-1}$ makes with $\nu_{m-1}$, measuring θ in a definite sense,
with $0 \le \theta < 2\pi$.

If f(c, r) = 0 at any point r other than r = 0, there will be a half plane $\mu_{m-1}$
with a maximum θ for half planes on which f(c, r) = 0 at some non-null point r.
Let $\mu^*_{m-1}$ designate this half plane. On the half plane opposite $\mu^*_{m-1}$, $f(c, r) \ne 0$
for |r| = 1. Hence a half plane for which θ is slightly larger than on $\mu^*_{m-1}$
will determine an (m − 1)-plane $\Pi_{m-1}$ satisfying the lemma.

Lemma 1.3. If f is non-negative in r at x = c, and if (1.4) holds, then f is P.S.N.

By virtue of the preceding lemma there exists an (n − 1)-plane A·r = 0
such that f(c, r) > 0 when $A \cdot r \ge 0$ and $r \ne 0$. Let k be an arbitrary
non-negative constant. Then

(1.5)    $f(c, r) - kA \cdot r > 0$

when $A \cdot r < 0$, since $f \ge 0$. But the left member of (1.5) is positive when
$A \cdot r \ge 0$, k = 0, and |r| = 1, in fact is positive and bounded from zero. Hence
the left member of (1.5) is positive when $A \cdot r \ge 0$, $r \ne 0$, and k is positive and
sufficiently small. For such a k (1.5) holds for each $r \ne 0$, and the lemma
follows.

THEOREM 1.1. The condition that f be P.S.N. at x = c is equivalent to the 
condition (1.4). 

By virtue of the convexity of f(c, r) there exists a vector a such that
$f(c, r) \ge a \cdot r$. Setting

(1.6)    $\varphi(c, r) = f(c, r) - a \cdot r$

we see that φ is convex and non-negative at x = c. If (1.4) holds then

    $\varphi(c, r) + \varphi(c, -r) > 0$  $(r \ne 0)$

and we can apply the preceding lemma to φ to conclude that φ is P.S.N. at x = c.
It follows that f is P.S.N. at x = c.

On the other hand suppose that f is P.S.N. at x = c so that $f(c, r) > b \cdot r$ for
$r \ne 0$. Then

    $f(c, r) + f(c, -r) > b \cdot r - b \cdot r = 0$  $(r \ne 0)$


and the proof of the theorem is complete. 
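
As a rough numerical illustration of Theorem 1.1 (not part of the original argument), the following Python sketch samples unit vectors in the plane and tests condition (1.4) together with the P.S.N. inequality f(c, r) > b·r for a trial vector b. The two integrands, the trial vectors b, and all function names are assumptions chosen for illustration; a finite sample can suggest, but cannot prove, either condition.

```python
import math

def unit_vectors(count=720):
    """Sample unit vectors in the plane at equally spaced angles."""
    return [(math.cos(2 * math.pi * i / count), math.sin(2 * math.pi * i / count))
            for i in range(count)]

def satisfies_1_4(f, samples):
    """Check f(r) + f(-r) > 0 on the sampled unit vectors (condition (1.4))."""
    return all(f(r) + f((-r[0], -r[1])) > 0 for r in samples)

def is_psn_with(f, b, samples):
    """Check f(r) > b.r on the sampled unit vectors for a trial vector b."""
    return all(f(r) > b[0] * r[0] + b[1] * r[1] for r in samples)

# Hypothetical integrands at a fixed x = c, homogeneous of degree one and convex in r.
def f_shifted_norm(r):
    return math.hypot(r[0], r[1]) + 2.0 * r[0]   # |r| + 2 r^1, negative for some r

def f_linear(r):
    return 3.0 * r[0] - r[1]                     # linear in r, so f(r) + f(-r) = 0

samples = unit_vectors()
print(satisfies_1_4(f_shifted_norm, samples), is_psn_with(f_shifted_norm, (2.0, 0.0), samples))
print(satisfies_1_4(f_linear, samples), is_psn_with(f_linear, (0.0, 0.0), samples))
```

The first integrand takes negative values yet passes both tests with the trial vector b = (2, 0), while the linear integrand fails (1.4) and fails the P.S.N. test for the trial vector b = 0, in line with the equivalence asserted by the theorem.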

We shall say that f is regularly convex at c if (1.2) holds as stated, with a in
(1.2) uniquely determined by $(c, r_0)$ for each $r_0 \ne 0$.

Lemma 1.4. If f is regularly convex and non-negative at x = c, and if f(c, r)
vanishes identically on some line L through the origin in the space R, then f(c, r)
vanishes identically in r.

Let $r_0$ be a point $r \ne 0$, not on L. Then, using (1.3),

    $f(c, r) \ge a \cdot (r - r_0) + f(c, r_0) = a \cdot r$

for a unique constant vector a and for every r. In particular if p is a point r
on L not 0, $0 \ge a \cdot p$. This is possible on all of L only if $a \cdot p = 0$.
But f(c, p) = 0 by hypothesis, so that

(1.7)    $f(c, r) \ge a \cdot r = a \cdot (r - p) + f(c, p)$.

It is trivial that (1.7) holds with a replaced by 0. Since f is regularly convex
at x = c, (1.7) can hold for but one constant a so that a = 0. Finally $f(c, r_0) =
a \cdot r_0 = 0$, so that f(c, r) vanishes identically in r.

Lemma 1.5. If f is regularly convex and non-negative in r at x = c and $f(c, r) \not\equiv 0$
in r, then f is P.S.N. at x = c.

Under the hypotheses of the lemma it follows from Lemma 1.4 that f vanishes 
identically on no line through the origin in the space R, and hence condition 
(1.4) holds. We conclude from Lemma 1.3 that f is P.S.N. at x = c.

THEOREM 1.2. If f is regularly convex at x = c the condition that f be P.S.N.
at x = c is equivalent to the condition that f be non-linear in r at x = c.

This theorem follows upon using (1.6) much as in the proof of Theorem 1.1, 
Lemma 1.5 replacing Lemma 1.3. 


Examples will show that Theorem 1.2 is false if the condition of regular 
convexity be replaced by convexity. 


2. The lower semi-continuity of J


To the conditions I and II of homogeneity and convexity imposed on f in §1 
we now add the following two conditions.

III. The function f shall be bounded for |r| = 1.

IV. The function f shall be lower semi-continuous in x and r. 

We admit curves g of finite length and shall use representations x(t), $a \le t \le b$,
of g in which $x^i(t)$, i = 1, ..., n, is absolutely continuous in t. It follows from
a theorem in Carathéodory [1] p. 377 that $f[x(t), \dot x(t)]$ is a measurable function¹
of t. Let x = φ(s) be the representation of g in terms of arc length with
$0 \le s \le s_0$. The function $f(\varphi, \dot\varphi)$ is measurable and bounded. As a function of
t, s(t) is absolutely continuous. We thus have

    $\int_0^{s_0} f[\varphi(s), \dot\varphi(s)]\, ds = \int_a^b f[\varphi(s(t)), \dot\varphi(s(t))]\, \dot s(t)\, dt.$

Upon using the homogeneity of f the latter integral takes the form

(2.1)    $\int_a^b f[x(t), \dot x(t)]\, dt.$

The integral (2.1) is thus independent of the particular representation x(t)
of g which is used. We denote the integral (2.1) by J(g).²
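
The independence of (2.1) from the representation can be checked numerically on a simple case. The sketch below is only an illustration under stated assumptions: the integrand f, the quarter circle, the reparameterization t → t², and the polygonal approximation are all hypothetical choices, not taken from the text. Because f is homogeneous in r, the quantity f(x, ẋ) dt may be replaced by f(x, Δx) on each chord, so no derivative of the representation is needed.

```python
import math

def f(x, r):
    """Hypothetical integrand, positively homogeneous of degree one in r."""
    return (1.0 + x[1]) * math.hypot(r[0], r[1]) + 0.5 * r[0]

def J(curve, steps=4000):
    """Approximate J by summing f(midpoint, chord) over an inscribed polygon."""
    total = 0.0
    prev = curve(0.0)
    for i in range(1, steps + 1):
        cur = curve(i / steps)
        mid = ((prev[0] + cur[0]) / 2, (prev[1] + cur[1]) / 2)
        total += f(mid, (cur[0] - prev[0], cur[1] - prev[1]))
        prev = cur
    return total

def quarter_circle(t):
    """Unit quarter circle traversed at constant speed for 0 <= t <= 1."""
    return (math.cos(math.pi * t / 2), math.sin(math.pi * t / 2))

def quarter_circle_reparam(t):
    """The same curve traversed at a different, non-constant speed."""
    return quarter_circle(t * t)

j1 = J(quarter_circle)
j2 = J(quarter_circle_reparam)
exact = math.pi / 2 + 0.5   # value of the integral for this f on the quarter circle
print(j1, j2, exact)
```

For this particular f the integral over the quarter circle can be evaluated in closed form as π/2 + 1/2, and both representations reproduce it to within the discretization error.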

We shall make use of the following lemma from McShane [2], p. 603. 

Lemma 2.1. If f is convex in r and conditioned as in I, II, III and IV, then
for each constant k > 0 there exists a sequence of functions $\varphi_\mu(x, r)$ which for x on A
and $|r| \le k$ are convex in r for fixed x, of class³ C′ in (x, r), and such that for
(x, r) fixed $\varphi_\mu(x, r)$ converges to f(x, r) as μ becomes infinite, and

    $\varphi_1(x, r) \le \varphi_2(x, r) \le \cdots.$

Let k be a positive constant and $x_m(t)$, m = 0, 1, 2, ..., $0 \le t \le 1$, be a se-
quence of admissible representations of curves on A, with $|\dot x_m| \le k$ for almost
all values of t and with $x_m(t)$ converging uniformly to $x_0(t)$ as m becomes infinite.
Let φ(x, r) be any one of the functions $\varphi_\mu(x, r)$. We shall show that

(2.2)    $\liminf_{m \to \infty} \int_0^1 \varphi(x_m, \dot x_m)\, dt \ge \int_0^1 \varphi(x_0, \dot x_0)\, dt.$


1 The derivative $\dot x(t)$ exists almost everywhere. In a Lebesgue integral the integrand
can be undefined on a set E of measure 0. Or one can set such an integrand equal to an
arbitrary constant on the set E.

2 If g is a non-rectifiable curve we could define J(g) as the inferior limit of J(C) for recti-
fiable curves C as C converges in the sense of Fréchet to g. If f is P.S.N. at points on g
we shall see in §3 that J(g) would then be infinite. Inasmuch as f will be assumed P.S.N.
in our final theorems we do not find it useful to include non-rectifiable curves in the domain
of definition of J(g).

3 A function will be understood to be of class C′ on a subset M of a euclidean space if it is
of class C′ on some region containing M.




Since φ(x, r) is convex in r

(2.3)    $\varphi(x_0, \dot x_m) - \varphi(x_0, \dot x_0) \ge \varphi_r^0 \cdot (\dot x_m - \dot x_0)$

where the superscript 0 indicates evaluation for $(x, r) = (x_0, \dot x_0)$. The t-integral
of the right member of (2.3) tends to 0 with 1/m by virtue of a theorem in
Hobson [1], §279. Accordingly

(2.4)    $\liminf_{m \to \infty} \int_0^1 \varphi(x_0, \dot x_m)\, dt \ge \int_0^1 \varphi(x_0, \dot x_0)\, dt.$

But the left members of (2.2) and (2.4) are equal by virtue of the uniform con-
tinuity of φ(x, r) for x on A and $|r| \le k$. Relation (2.2) then follows as a con-
sequence of this fact and of (2.4).

THEOREM 2.1. The integral J(g) is a lower semi-continuous function of g on
any class of admissible curves of bounded length.

Let $g_m$, m = 1, 2, ... be a sequence of curves on A, converging in the sense
of Fréchet to $g_0$, with lengths $L(g_m)$ at most a constant independent of m.
In accordance with a lemma of McShane [1], p. 10, the curves $g_m$ can be given
representations $x_m(t)$, $0 \le t \le 1$, such that $|\dot x_m| \le k$ for some positive constant k,
and for almost all values of t on the interval [0, 1], while $x_m(t)$ converges uni-
formly to $x_0(t)$. Corresponding to the constant k let $\varphi_\mu(x, r)$ be the sequence of
functions affirmed to exist in Lemma 2.1. Regarded as a function of the repre-
sentation $x_m(t)$ the integral

    $I_\mu(x_m) = \int_0^1 \varphi_\mu(x_m, \dot x_m)\, dt$

is lower semi-continuous at $x_0$ in accordance with (2.2). But as μ becomes in-
finite $I_\mu(x_m)$ tends, without decreasing, to $J(g_m)$. Hence J(g) is lower semi-
continuous at $g_0$, and the proof is complete.


3. Conditions for bounded-compactness of J(g)


Let c be an arbitrary constant. If for each c the set of admissible curves on
which $J(g) \le c$ is compact, we shall say that J is boundedly compact. If L(g)
is the length of g, the set of curves for which $L(g) \le c$ is compact, as is well known.
We accordingly seek conditions on f and J under which any class of admissible
curves for which J is bounded is a class for which L is bounded.

Let $x_m(t)$, m = 1, 2, ... be a sequence S of curve representations in which t
is the arc length with $0 \le t \le a_m$. In case $a_m$ becomes infinite with m, x(t)
will be called a pseudo limit of S if x(t) is defined and absolutely continuous for
$0 \le t < \infty$, and if there exists a subsequence $\xi_n(t)$, n = 1, 2, ..., of the sequence
$x_m(t)$ such that on each finite sub-interval for t, $\xi_n(t)$ converges uniformly to
x(t). In this definition we are concerned with representations rather than
curves. In the “pseudo-limit” x(t), t is not necessarily the arc length of the
curve x = x(t).




Lemma 3.1. Corresponding to the sequence of representations S at least one
subsequence converges to a “pseudo-limit” x(t).

In accordance with Ascoli's theorem (see Tonelli [1], p. 78) there exists a
subsequence $S_1$ of S of the form $\xi_n(t)$, n = 1, 2, ..., such that $\xi_n(t)$ converges
uniformly for $0 \le t \le 1$ to an absolutely continuous function $X_1(t)$. We proceed
inductively, assuming that $S_{m-1}$ is a well defined subsequence of $S_{m-2}$, m =
3, 4, .... With Ascoli we infer the existence of a subsequence $S_m$ of $S_{m-1}$ of vector
functions converging uniformly, for $m - 1 \le t \le m$, to an absolutely con-
tinuous function $X_m(t)$. We define X(t) for $0 \le t < \infty$ by setting

    $X(t) = X_m(t)$  $(m - 1 \le t < m,\ m = 1, 2, \dots)$.

Let S* be the subsequence $\xi_m(t)$ of S in which $\xi_m(t)$ is the m-th representation
in $S_m$. It is clear that the “diagonal sequence” $\xi_m(t)$ converges uniformly to
X(t) on each finite sub-interval for t, and that X(t) is a “pseudo-limit” of S.

Lemma 3.2. Let C be a curve, not necessarily rectifiable, at each point of which f
is P.S.N. Let λ be a positive constant. There exist positive constants δ and η
so small, that when a rectifiable curve g of length λ lies on the δ-neighborhood of some
point of C, then J(g) > η.

Suppose the lemma false. There will then exist a sequence $g_n$ of arcs of
length λ converging uniformly in point-wise fashion to some point $x_0$ of C, while
$\lim J(g_n) \le 0$. Since f is P.S.N. at $x_0$ there exists a constant vector b such that

(3.1)    $f(x, u) - b \cdot u > k > 0$,  k constant,

for all vectors u of unit length, and for all points x on some neighborhood N of $x_0$.
Let n be so large that $g_n$ is on N. Let $x_n(s)$ be the representation of $g_n$ in terms
of arc length. It follows from (3.1) that

(3.2)    $J(g_n) > \int_0^\lambda b \cdot \dot x_n\, ds + k\lambda.$

But the integral on the right of (3.2) converges to 0 as $g_n$ converges point-wise
to $x_0$. Since k > 0, $\lim J(g_n)$ is positive. From this contradiction we infer
the truth of the lemma.

THEOREM 3.1. If a sequence of admissible curves $C_m$ converges to a curve C,
not necessarily rectifiable, and if f is P.S.N. at each point of C then $J(C_m)$ becomes
infinite with $L(C_m)$.⁴

We apply the preceding lemma taking λ = 1 and using the constants δ and η
of the preceding lemma. If C be divided into successive arcs in any way, the
number of these arcs with diameters d > δ/3 is bounded by some integer M.
Let $C_m$ be divided into successive arcs with L < 1 on the last arc, and L = 1
on the other arcs. Suppose m is so large that C and $C_m$ admit a homeomorphism
$T_m$ in which corresponding points have a distance less than δ/3.


4 The condition that f(x, r) be convex in r is not used in the proof of this theorem or of
Lemma 3.2.




If the $T_m$-map on C of a unit arc h of $C_m$ has a diameter $d \le \delta/3$, it is clear
that h is within a distance δ of some point of C. At most M of the unit arcs of
$C_m$ can accordingly fail to be within a distance δ of some point of C. The re-
maining unit arcs of $C_m$ each contribute at least η to $J(C_m)$. As $L(C_m)$ becomes
infinite $J(C_m)$ accordingly becomes infinite and the proof of the theorem is
complete.

If the admissible arcs g on which $J(g) \le 0$ are bounded in length by a constant
λ, J will be said to satisfy the condition of Hahn.

In the condition of Hahn we shall always take λ > 0. If J satisfies the condi-
tion of Hahn, J(C) is bounded below regardless of the length of C. In fact, if
B is the absolute minimum of f(x, r) for x on A and |r| = 1, then

    $J(C) \ge \min(0, B\lambda).$


An admissible curve whose length is a multiple of λ will be called a λ-arc.
By an initial sub-arc of a curve h will be meant a sub-arc of h with its initial
point the initial point of h. The value of J on each initial λ-arc of a λ-arc C
lies between 0 and J(C) inclusive. If $H_m$ is composed of a sequence $h_1, \dots, h_m$
of λ-arcs then

    $J(H_m) = J(h_1) + \cdots + J(h_m),$

none of the terms in the sum being negative. If $J(H_m)$ is bounded independ-
ently of m, and m becomes infinite, then for sufficiently large values of m, $H_m$
will possess a λ-sub-arc with an arbitrarily great length and an arbitrarily
small value of J.

THEOREM 3.2. If f is P.S.N. at each point x on A, and if J satisfies the condi-
tion of Hahn, then L is bounded on any class of curves on which J is bounded.

For earlier theorems of this type, see Hahn [1], Graves [1], McShane [4], and 
Tonelli [2]. 

If the theorem is false there exists a sequence $C_m$, m = 1, 2, ..., of λ-arcs
on which J is at most a positive constant M, but on which L becomes infinite
with m. In the set of λ-sub-arcs of the arcs $C_m$, there accordingly exists a
sequence $g_\mu$, μ = 1, 2, ..., of λ-arcs on which L becomes infinite with μ, and
on which J tends to 0 with 1/μ. By virtue of Lemma 3.1 a suitably chosen
subsequence $h_p$, p = 1, 2, ..., of the sequence $g_\mu$ possesses a “pseudo-limit”
x(t) with $0 \le t < \infty$. It follows from the lower semi-continuity of J that on
each sub-arc of x(t) on which $0 \le t \le q$ where q is a positive integer, $J \le 0$.
By virtue of the Hahn condition the length of the curve x(t) is at most λ.

As t becomes positively infinite, x(t) must then tend to a limit point c, and with
the addition of c as an end point define a rectifiable curve C. By virtue of the
definition of a “pseudo limit,” suitably chosen initial λ-arcs $k_p$, p = 1, 2, ...,
of the respective arcs $h_p$ converge in the sense of Fréchet to C, while $L(k_p)$
becomes infinite with p. But $J(k_p)$ is bounded by M contrary to Theorem 3.1.
We infer the truth of Theorem 3.2.

Combining this theorem with the well known theorem of Hilbert on the com-
pactness of the curve class on which $L \le$ const., we have the following theorem.




THEOREM 3.3. If f is P.S.N. at each point x of A, and if J satisfies the Hahn 
condition, then the class of admissible curves on which J is at most a finite constant 
is compact. 

Theorem 1.2 leads to the following corollary of Theorem 3.3. 

Corollary 3.1. If f is regularly convex and non-linear in r at each point x of A,
and if J satisfies the Hahn condition, then the class of admissible curves on which 
J is at most a finite constant is compact. 

Recall that the general conditions on f not explicitly mentioned in this corol- 
lary are those of homogeneity, convexity in r, boundedness for |r| = 1, and 
lower semi-continuity. We are not primarily concerned with the existence of a 
curve minimizing J and joining two given points of A. But under our general 
conditions on f and the special conditions of Theorems 3.2 and 3.3, or the above 
Corollary, such a minimizing curve exists. The compactness and lower semi- 
continuity necessary for the conventional proof are immediately available. 


4. Convergence in length and J-length 


The hypothesis that f(x, r) be convex in r for each x has various consequences
which we enumerate. As we have noted f(x, r) is continuous in r. To continue,
let u be a unit vector. For each fixed x and r, f(x, r + hu) has a right derivative
as to h when h = 0. This derivative will be denoted by f′(x, r, u). It follows
from Bonnesen und Fenchel [1], p. 19 that

(4.1)    $f(x, r) - f(x, r - u) \le f'(x, r, u) \le f(x, r + u) - f(x, r).$

We are assuming that f is homogeneous in the sense of (1.1) and bounded for
|r| = 1. It follows that the extreme members of (4.1) are bounded for |r| = 1,
|u| = 1, and x on A. There accordingly exists a positive constant H such that

(4.2)    $|f'(x, r, u)| \le H$

for |r| = 1, |u| = 1 and x on A. But it follows from the homogeneity of f(x, r)
that

    $f'(x, kr, u) = f'(x, r, u)$  $(k > 0)$.


We conclude that (4.2) holds without restriction on r. An immediate conse- 
quence of (4.2) is the following. 

(a) Under our conditions on f of homogeneity in r, convexity in r, and bounded- 
ness for |r | = 1, the function f is continuous in r, uniformly with respect to x and 
r for x on A and arbitrary r. 

To obtain the theorems on convergence in length and J-length we replace the 
hypothesis that f be lower semi-continuous by the following 

IVa. The function f shall be continuous in x for each fixed r. 

Our general hypotheses on f are now those of 

I. Homogeneity in r 

II. Convexity in r 

III. Boundedness for | r | = 1 

IVa. Continuity in x.




We shall prove the following 

(b) Hypothesis III is a consequence of I, II and IVa.

We shall show that if I, II and IVa hold, f is bounded for $|r| \le k$ where k
is any positive constant. On account of the compactness of the set $|r| \le k$
it will be sufficient to show that f is bounded for x on A and r neighboring an
arbitrary point p.

We first show that f is bounded above for r neighboring p. In the space R
let $r_0, \dots, r_n$ be the vertices of a simplex E containing p in its interior. Since
f is continuous in x, $f(x, r_i)$ admits an upper bound M, for x on A,
and i = 0, ..., n. Because of the convexity of f in r, $f \le M$ for x on A and
r on E. Hence f is bounded above for $|r| \le k$.

We continue by showing that f is bounded below for r neighboring p and x
on A. Referring again to Bonnesen and Fenchel [1], p. 19, (2), we have the
relation

    $f(x, p + hu) - f(x, p) \ge h\,[f(x, p) - f(x, p - u)]$  $(0 < h < 1)$.

For us p is fixed, u is an arbitrary unit vector and x is on A. For such variables
f(x, p) is bounded by virtue of IVa, and −f(x, p − u) is bounded below in ac-
cordance with the results of the preceding paragraph, so that f(x, p + hu) is
bounded below. Hence f(x, r) is bounded below for x on A and r on a neigh-
borhood of p.

Statement (b) follows. 

When I, II and IVa hold the conclusion of (a) holds. This taken with IVa
implies the following

(c) Under hypotheses I, II and IVa f(x, r) is continuous in (x, r) for x on A
and arbitrary r.

If a sequence of admissible curves $C_m$, m = 1, 2, ..., converges to an ad-
missible curve $C_0$ in the sense of Fréchet, and if $J(C_m)$ converges to $J(C_0)$, we
say that $C_m$ converges in J-length to $C_0$. When J = L this defines convergence
in length.
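
That convergence in length asks for more than convergence in the sense of Fréchet can be seen on a standard example, sketched below under assumptions of our own: the sawtooth curves, their vertex lists, and the helper names are hypothetical and serve only as an illustration for J = L. Each sawtooth lies within its maximum height of the unit segment on the first axis, so the curves converge to that segment in the sense of Fréchet, yet every one of them has length √2 while the segment has length 1.

```python
import math

def sawtooth_vertices(m):
    """Vertices of a sawtooth with m teeth over [0, 1], with slopes +1 and -1."""
    pts = []
    for i in range(2 * m + 1):
        x = i / (2 * m)
        y = 0.0 if i % 2 == 0 else 1.0 / (2 * m)
        pts.append((x, y))
    return pts

def length(pts):
    """Length of the polygon through the given vertices."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def max_height(pts):
    """Largest distance from the polygon to the segment from (0, 0) to (1, 0)."""
    return max(y for _, y in pts)

teeth = (1, 10, 100, 1000)
lengths = [length(sawtooth_vertices(m)) for m in teeth]
heights = [max_height(sawtooth_vertices(m)) for m in teeth]
print(lengths)
print(heights)
```

The heights tend to 0 while the lengths stay at √2, so this sequence converges to the segment but does not converge to it in length.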

If s is the arc length on an admissible curve C and one sets s = tL(C), the 
parameter t is called the reduced arc length on C. The reduced arc length varies 
from 0 to 1 inclusive. A representation x(t) of C in terms of reduced arc length 
will be called a reduced representation of C. The following theorem is well known: 
If $C_m$ converges in length to $C_0$ and $x_m(t)$ and $x_0(t)$ are reduced representations
of $C_m$ and $C_0$ respectively then

(4.3)    $\lim_{m \to \infty} \int_0^1 |\dot x_m - \dot x_0|\, dt = 0$

and $x_m(t)$ converges uniformly to $x_0(t)$. (See McShane [3], p. 51. Earlier
results of the type of (4.3) are in Tonelli [1], p. 186, Adams and Lewy [1], p.
24 and A. Morse [1], p. 72.)

THEOREM 4.1. Under conditions I, II and IVa on f, convergence in length
to an admissible curve $C_0$ implies convergence in J-length to $C_0$.


Let $C_m$, m = 1, 2, ..., be a sequence of admissible curves converging in
length to $C_0$. Let $x_m(t)$, m = 0, 1, ..., be the reduced representation of $C_m$.
We have $|\dot x_m| = L(C_m)$ for almost all values of t, $0 \le t \le 1$, so that $|\dot x_m|$ is
bounded for almost all values of t.

To establish the theorem one notes that

(4.3)    $J(C_m) - J(C_0) = \int_0^1 [f(x_m, \dot x_m) - f(x_0, \dot x_m)]\, dt + \int_0^1 [f(x_0, \dot x_m) - f(x_0, \dot x_0)]\, dt.$

The first integral on the right of (4.3) tends to 0 with 1/m by virtue of the
uniform continuity of f(x, r) for |r| bounded. It follows from (4.2) that

    $|f(x_0, \dot x_m) - f(x_0, \dot x_0)| \le H\, |\dot x_m - \dot x_0|$

so that the second integral in (4.3) is at most

(4.4)    $H \int_0^1 |\dot x_m - \dot x_0|\, dt$

in absolute value. As we have noted before the integral (4.4) tends to 0 with
1/m as $C_m$ converges in length to $C_0$. Thus $J(C_m) - J(C_0)$ converges to 0
with 1/m and the proof of the theorem is complete.

Let IIa denote the condition on f of regular convexity. Condition IIa implies
that at $(x_0, r_0)$ the “figurative” $z = f(x_0, r)$ has but one “supporting” n-plane

(4.5)    $z - f(x_0, r_0) = a(x_0, r_0) \cdot (r - r_0)$  $(r_0 \ne 0)$.

We shall prove the following lemma.

Lemma 4.1. Under the conditions I, IIa, and IVa on f, $f_r$ exists and is a con-
tinuous function of (x, r), $r \ne 0$.

Let $\Lambda(x_0, r_0)$ be a unit vector in the space (r, z) defining the direction of the
normal to the supporting plane (4.5) taken with a positive z-component. As
(x, r) tends to $(x_0, r_0)$, $\Lambda(x, r)$ tends to $\Lambda(x_0, r_0)$. For if (x, r) tends to $(x_0, r_0)$
through a discrete set, any cluster value $\Lambda_0$ of the corresponding sequence of
vectors $\Lambda(x, r)$ defines a normal to an n-plane supporting the figurative at
$(x_0, r_0)$. Under condition IIa, it follows that $\Lambda_0 = \Lambda(x_0, r_0)$. Thus $\Lambda(x, r)$
is continuous in (x, r). Since each supporting n-plane is of the form (4.5) at
each point (x, r) it follows that the component $\Lambda^{(n+1)}(x, r) \ne 0$. Finally

    $a^{(i)}(x, r) = -\dfrac{\Lambda^{(i)}(x, r)}{\Lambda^{(n+1)}(x, r)}$  $(i = 1, \dots, n)$

and we conclude that a(x, r) is continuous in (x, r).

To prove the lemma it remains to establish that $f_r(x, r)$ exists and equals a(x, r).
To that end let C be the convex curve in the space (r, z) cut out of the figurative
$z = f(x_0, r)$ by the 2-plane Π through $r = r_0$ on which z and $r^{(i)}$ alone vary.




The curve C is supported at the point $(x_0, r_0)$ by a unique line L. For through
every line L in Π supporting C at $(x_0, r_0)$ passes an n-plane supporting the
figurative at $(x_0, r_0)$. If u is a unit vector in the direction of the $r^{(i)}$ axis it
follows that

    $f'(x_0, r_0, u) = -f'(x_0, r_0, -u)$

so that $f_{r^{(i)}}(x_0, r_0)$ exists and equals the directional derivative $f'(x_0, r_0, u)$.

Finally this derivative equals $a^{(i)}(x_0, r_0)$. For the 2-plane Π intersects the
n-plane (4.5) in a line L supporting C at $(x_0, r_0)$ with a slope of $a^{(i)}(x_0, r_0)$ with
respect to the $r^{(i)}$ axis. It follows that

    $a^{(i)}(x_0, r_0) = f_{r^{(i)}}(x_0, r_0)$

and the proof of the lemma is complete.

We seek conditions on f under which convergence in J-length to an admissible
curve $C_0$ implies convergence in length to $C_0$. We begin with an extension of a
lemma of McShane, [1], p. 9, making use of the Weierstrass E-function E(x, r, q).

Lemma 4.2. Let $C_m$, m = 1, 2, ..., be a sequence of curves with reduced repre-
sentations $x_m(t)$ and lengths at most M, such that $x_m(t)$ converges uniformly to a
representation $x_0(t)$ defining a curve $C_0$. Among values of t at which $\dot x_0$ exists let ω
be the set at which $\dot x_0 = 0$ and let σ be the residual set. Let b(t) be an arbitrary
bounded measurable vector function of t.

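Then⁵

(4.6)    $\liminf_{m \to \infty}\, [J(C_m) - J(C_0)] \ge \liminf_{m \to \infty} \int_\sigma E(x_0, \dot x_0, \dot x_m)\, dt + \liminf_{m \to \infty} \int_\omega [f(x_0, \dot x_m) - b(t) \cdot \dot x_m]\, dt,$

the equality prevailing when m(ω) = 0.

By elementary additions and subtractions of the integrands concerned on the
separate sets σ and ω we find that

    $J(C_m) - J(C_0) = \int_0^1 [f(x_m, \dot x_m) - f(x_0, \dot x_m)]\, dt + \int_\sigma E(x_0, \dot x_0, \dot x_m)\, dt$
    $\qquad + \int_\omega [f(x_0, \dot x_m) - b(t) \cdot \dot x_m]\, dt + \int_\sigma f_r(x_0, \dot x_0) \cdot (\dot x_m - \dot x_0)\, dt$
    $\qquad + \int_\omega b(t) \cdot \dot x_m\, dt.$


5 The derivative $\dot x_0$ may not exist on a set E of measure 0 and on such a set the integrands
in (4.6) are not defined. We understand that these integrands are set equal to an arbitrary
constant on E.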




The first integral on the right tends to 0 with 1/m by virtue of the uniform convergence of $x_m(t)$ to $x_0(t)$, the fact that $|\dot x_m| \le M$, and the uniform continuity of f(x, r) over the set of variables concerned. The sum of the last two integrals tends to 0 with 1/m by virtue of the convergence of $x_m$ to $x_0$ and a theorem in Hobson [1], §279 already cited. The lemma then follows.

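LEMMA 4.3. If the curves $C_m$ of the preceding lemma satisfy the condition

$$L(C_m) \ge L(C_0) + e \qquad (m = 1, 2, \dots)\quad (e > 0) \tag{4.7}$$

then at least one of the two following cases occurs.

Case I. The derivative $\dot x_0 = 0$ on a set $\omega$ of positive measure.⁶

Case II. There exist an integer N and positive constants $\eta$ and $\delta$ such that

$$\left|\frac{\dot x_m}{|\dot x_m|} - \frac{\dot x_0}{|\dot x_0|}\right| \ge \eta \qquad (m > N) \tag{4.8}$$

on a set $\omega_m$ of measure exceeding $\delta$.

We shall apply Lemma 4.2 to the function J = L. Assuming that Case I does not hold we shall prove that Case II must hold. When Case I does not hold, $m(\omega) = 0$. Let $\sigma_m$ be the subset of $\sigma$ on which $\dot x_m \ne 0$. According to (4.7) and Lemma 4.2 we have

$$e \le \liminf_{m\to\infty}\,[L(C_m) - L(C_0)] = \liminf_{m\to\infty} \int_\sigma E^*(x_0, \dot x_0, \dot x_m)\,dt$$

where $E^*$ is the Weierstrass E-function set up for L. Developing $E^*$ more explicitly and noting that $|\dot x_m| \le M$ almost everywhere on $\sigma_m$ we have

$$e \le \liminf_{m\to\infty} \int_{\sigma_m} |\dot x_m| \left[1 - \frac{\dot x_m\cdot\dot x_0}{|\dot x_m|\,|\dot x_0|}\right] dt \le \frac{M}{2} \liminf_{m\to\infty} \int_{\sigma_m} \left|\frac{\dot x_m}{|\dot x_m|} - \frac{\dot x_0}{|\dot x_0|}\right|^2 dt.$$

Hence (4.8) holds as stated and Lemma 4.3 follows.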

We shall say that f is strongly convex at a point $(x_0, r_0)$ at which $r_0 \ne 0$, if the convexity relation (1.2) holds with the equality excluded for each vector r differing in direction from $r_0$. Expressed in terms of the Weierstrass E-function this condition requires that $r_0 \ne 0$ and that $E(x_0, r_0, r) > 0$ for each vector r differing in direction from $r_0$.
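As an illustration only (a sketch of ours, not part of the original argument), take the length integrand f(x, r) = |r|, for which the E-function reduces to E*(p, q) = |q| - q·p/|p|. The names `norm` and `e_length` and the tuple representation of vectors are assumptions made for this example. The values show that E* vanishes when q has the direction of p and is positive otherwise, which is the strong convexity condition above.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a tuple of floats."""
    return math.sqrt(sum(c * c for c in v))


def e_length(p, q):
    """E-function of the length integrand f(x, r) = |r|.

    E*(p, q) = |q| - q.p/|p|, defined for a non-null vector p.
    """
    n_p = norm(p)
    if n_p == 0.0:
        raise ValueError("p must be a non-null vector")
    return norm(q) - sum(a * b for a, b in zip(p, q)) / n_p


p = (1.0, 0.0)
print(e_length(p, (2.0, 0.0)))   # 0.0: q has the direction of p
print(e_length(p, (0.0, 3.0)))   # 3.0: q is orthogonal to p
print(e_length(p, (-1.0, 0.0)))  # 2.0: q is opposite to p
```

For unit vectors the same quantity equals half the squared distance between the two directions, which is the form used in the proof of Lemma 4.3.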

In proving the next theorem we shall make use of the following theorem due to Lusin. If h(t) is a finite measurable function defined on a bounded set g of measure exceeding a positive constant $\delta$, there exists a subset of g of measure exceeding $\delta$ on which h(t) is continuous, and hence a closed subset of g of measure exceeding $\delta$ on which h(t) is continuous. See Saks [1], p. 44 for a proof.

The following theorem generalizes a theorem of Lindeberg. Cf. Tonelli [1] 
p. 321 and McShane [1] p. 40. 

THEOREM 4.2. Under the general conditions I, IIa, and IVa, if f is P.S.N. at each point of an admissible curve $C_0$: $x = \varphi(s)$, and if f is strongly convex at



6 That Case I actually occurs can be shown by examples. 




$[\varphi(s), \dot\varphi(s)]$ for almost all values of the arc length s for which $\dot\varphi(s) \ne 0$, then convergence in J-length to $C_0$ implies convergence in length to $C_0$.

If the theorem is false there will exist a sequence $C_m$, $m = 1, 2, \dots$, of admissible curves converging in J-length to $C_0$ while a condition such as (4.7) holds for each m. The curves $C_m$ are bounded in length by a positive constant M in accordance with Theorem 3.1. Hence a subsequence of these curves, which we again denote by $C_m$, $m = 1, 2, \dots$, can be parameterized together with $C_0$, as in Lemma 4.2. See McShane [1], p. 10. Lemmas 4.2 and 4.3 then apply. We have the two cases of Lemma 4.3.

Since f is P.S.N. at each point of $C_0$, there exists a function b(t) which is constant on each of a finite set of intervals covering [0, 1] such that for each t and $r \ne 0$

$$f[x_0(t), r] - b(t)\cdot r > k|r| \tag{4.9}$$

where k is a positive constant.
We make use of (4.6) taking b(t) in (4.6) as the b(t) of (4.9). Then

$$\liminf_{m\to\infty}\,[J(C_m) - J(C_0)] \ge \liminf_{m\to\infty} \int_\sigma E(x_0, \dot x_0, \dot x_m)\,dt + m(\omega)\,k \liminf_{m\to\infty} L(C_m). \tag{4.10}$$


CASE I. In this case $m(\omega) > 0$. Since $L(C_m)$ is bounded below by e, and $E \ge 0$ in (4.10), we infer that $J(C_m)$ does not tend to $J(C_0)$, contrary to hypothesis. The theorem follows in Case I.

CASE II. We can assume that Case I does not hold so that $m(\omega) = 0$. There exists a closed subset $\tau_m$ of the set $\omega_m$ of Lemma 4.3 on which $\dot x_0$ is continuous and $m(\tau_m) > \delta$. The set $(x_0, \dot x_0, u)$ for which t is on $\tau_m$ and

$$\left|\frac{\dot x_0}{|\dot x_0|} - u\right| \ge \eta, \qquad |u| = 1$$

is closed; hence on this set $E(x_0, \dot x_0, u) > c$ where c is a positive constant. But for almost all values of t on $\tau_m$, $|\dot x_m| = L(C_m)$, so that for these values of t and for m > N as in (4.8),

$$E(x_0, \dot x_0, \dot x_m) > c\,|\dot x_m| = c\,L(C_m) \ge ce \qquad (m > N).$$

It follows from (4.10), setting $m(\omega) = 0$, that $J(C_m)$ does not converge to $J(C_0)$, and from this contradiction we infer the truth of the theorem.


INSTITUTE FOR ADVANCED STUDY,
UNIVERSITY OF MISSOURI.


BIBLIOGRAPHY 


ADAMS, C. R. AND LEWY, H.
1. On convergence in length, Duke Mathematical Journal 1 (1935), 19-26.
BONNESEN, T. UND FENCHEL, W.
1. Theorie der konvexen Körper, Springer, Berlin (1934).
CARATHÉODORY, C.
1. Vorlesungen über reelle Funktionen, Teubner, Leipzig (1918).




GRAVES, L. M.
1. On an existence theorem of the calculus of variations, Monatshefte für Mathematik und Physik, 39 (1932), 101-4.
Using the hypotheses of Hahn [1], the results are extended to the n-dimensional case with the aid of two lemmas.
HAHN, H.
1. Über ein Existenztheorem der Variationsrechnung, Sitzungsberichte der Akademie der Wissenschaften in Wien, 134, Abt. 2A (1925), 437-47.
The conclusion of our Theorem 3.2 is obtained (p. 439) for n = 2 under the hypothesis that f be of class C' and that J be positive semi-definite and positive quasi-regular, with the further conditions that f(x, r) shall not vanish identically in r at any point x and that admissible arcs g on which J(g) = 0 are bounded in length. Our "condition of Hahn" is the generalization of this last condition as found in Tonelli [2], p. 94.
HOBSON, E. W.
1. The theory of functions of a real variable and the theory of Fourier series, volume II, Cambridge, (1926).
McSHANE, E. J.
1. Semi-continuity in the calculus of variations and absolute minima for isoperimetric problems, Contributions to the calculus of variations, University of Chicago, (1930).
2. Semi-continuity of integrals in the calculus of variations, Duke Mathematical Journal 2 (1936), 597-616.
3. Curve-space topologies associated with variational problems, Annali della R. Scuola Normale Superiore di Pisa, Serie II, 9 (1940), 45-60.
4. Remark concerning Mr. Graves' paper "On an existence theorem of the calculus of variations," Monatshefte für Mathematik und Physik, 39 (1932), 105-6.
It is shown that the conclusion of Lemma B of Graves [1] is a consequence of weaker hypotheses than those of Lemma A.
MENGER, K.
1. Metric methods in calculus of variations, Proceedings of the National Academy of Sciences, 23 (1937), pp. 244-250.
MORSE, A.
1. Convergence in variation and related topics, Transactions of the American Mathematical Society 41, (1937), 48-83.
MORSE, M.
1. Functional topology and abstract variational theory, Mémorial des Sciences Mathématiques, XCII, Gauthier-Villars, Paris, (1939).
2. Functional topology, Bulletin of the American Mathematical Society, 49 (1943), pp. 144-149.
SAKS, S.
1. Théorie de l'intégrale, Monografje Matematyczne, Warszawa (1933).
TONELLI, L.
1. Fondamenti di calcolo delle variazioni, Volume I, Zanichelli, Bologna, (1921).
2. Sull'esistenza del minimo in problemi di calcolo delle variazioni, Annali della R. Scuola Normale Superiore di Pisa, Serie II, 1 (1932), 89-90.
On p. 94 is found a theorem similar to our Theorem 3.2 with n = 2. The "condition of Hahn" is assumed and J is "positive quasi-regular semi-normal." If n = 2 and if f have the requisite derivatives our hypotheses are equivalent to those of Tonelli.




ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


THE VARIATIONAL THEORY IN THE LARGE INCLUDING THE 
NON-REGULAR CASE—SECOND PAPER 


By George Ewing and Marston Morse
(Received July 8, 1942)


Introduction 


In the first paper with this title the authors have reviewed and simplified the 
classical conditions for curve-space compactness, together with the conditions 
that convergence in length and convergence in J-length be equivalent. The 
hypothesis of convexity of the integrand f(x, r) in r so effectively used by McShane [1] and others is here carried still further. Other notions such as that of
a pseudo-limiting curve play an important rôle in simplifying the theorems and
their proofs. 

The present paper is concerned with the more novel problems of upper- 
reducibility and of the generalized Euler theorem or homotopy theorem. These 
problems belong peculiarly to the variational theory in the large. The original 
treatment of these problems in Morse [3] used methods which in part break down 
in non-regular problems. The underlying topological structure however re- 
mains essentially the same, subject to certain difficulties which have been sur- 
mounted in Morse [4] in preparation for the present paper. This paper by 
Morse gives the underlying definitions and the topological existence theorems 
for critical points. 

The theorems of the present paper are first established in a Euclidean region. We turn next to a compact Riemannian manifold $\Sigma$. It would be possible to treat the problems of upper-reducibility and the homotopy theorem on $\Sigma$ directly, with an appropriate use of an elaborate tensor analysis and with geodesics replacing straight lines. The necessity of doing this is avoided by the introduction of a new lemma, Lemma 7.2, on the choice of coordinate systems covering a curve. With the appropriate invariant formulation of the previously locally defined conditions of convexity, semi-normality etc., the theorems are finally stated and proved for $\Sigma$.


1. The metric spaces M, L, J 


We begin with certain definitions which can be given best in terms of functions on a metric space M. Let the points of M be denoted by Greek letters $\alpha$, $\beta$, etc., and let $\alpha\beta$ designate the distance between $\alpha$ and $\beta$ on M. We understand that $\alpha\beta$ satisfies the usual metric axioms. We shall be concerned with two functions $J(\alpha)$ and $L(\alpha)$ of the point $\alpha$ of M. We suppose that $J(\alpha)$ and $L(\alpha)$ are finite, single-valued and lower semi-continuous on M. In later sections $\alpha$ will be identified with a curve joining two points on a Riemannian manifold, $\alpha\beta$ will be the Fréchet distance between the curves $\alpha$ and $\beta$ while $J(\alpha)$ and $L(\alpha)$ will be integrals along $\alpha$, of which $L(\alpha)$ is the length of $\alpha$. We here suppose that $L \ge 0$, and that $J(\alpha)$ is bounded below.



Beside the metric of M we shall need two other metrics, an L-metric with a distance

$$|\alpha\beta| = \alpha\beta + |L(\alpha) - L(\beta)|$$

and a J-metric with a distance

$$(\alpha\beta) = \alpha\beta + |J(\alpha) - J(\beta)|.$$

We shall refer to the corresponding spaces as the spaces L and J. In connection with the metric spaces M, L or J, terms such as neighborhood, compact, etc., will be preceded by the letter M, L or J according to which metric is used to define these terms. If for a fixed $\beta$, $\alpha\beta$ tends to 0 and $J(\alpha)$ tends to $J(\beta)$, we say that $\alpha$ converges in J-length to $\beta$. The term convergence in L-length (or simply convergence in length) is thereby defined, taking J = L.
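The difference between M-convergence and convergence in length can be seen on a standard example, sketched here under our own assumptions: zigzag curves from (0, 0) to (1, 0) approach the straight segment in the metric of M while their lengths stay near $\sqrt{2}$. All function names are hypothetical. The amplitude is used only as an upper bound for the Fréchet distance to the segment (points with the same abscissa are matched), and the difference of lengths is a lower bound for the L-distance.

```python
import math


def zigzag(n):
    """Vertices of a zigzag from (0, 0) to (1, 0) with 2n segments of slope +1 or -1."""
    h = 1.0 / (2 * n)
    return [(k * h, h if k % 2 else 0.0) for k in range(2 * n + 1)]


def length(pts):
    """Length of the polygonal curve with vertices pts."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def frechet_upper_bound(n):
    """Amplitude of the zigzag; bounds its Frechet distance to the unit segment."""
    return 1.0 / (2 * n)


def l_distance_lower_bound(n):
    """|L(alpha) - L(beta)| with beta the unit segment of length 1."""
    return abs(length(zigzag(n)) - 1.0)


for n in (1, 10, 100):
    print(n, frechet_upper_bound(n), round(l_distance_lower_bound(n), 6))
```

The first bound tends to 0 with 1/n while the second stays near $\sqrt{2} - 1$, so these curves converge to the segment on M but not in length.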

Let E be a subset of L. Let $I_a$ be the interval $0 \le \tau \le a$, $a > 0$. By $E \times I_a$ we shall mean the product of E and $I_a$ and shall assign the usual metric to this product space. We shall admit deformations D of E which replace a point $\alpha$ found on E at the time $\tau = 0$, by a point $\varphi(\alpha, \tau)$ on L at the time $\tau$, $0 \le \tau \le a$. We also require the following.

(a). The point function $\varphi(\alpha, \tau)$ shall map $E \times I_a$ continuously into L.

(b). For each fixed $\alpha$ on E, $\varphi(\alpha, \tau)$ shall map $I_a$ continuously into M, uniformly with respect to $\alpha$ and $\tau$.

We term D a weak J-deformation if for each fixed $\alpha$ on E, and for $\varphi = \varphi(\alpha, \tau)$, $J(\alpha) - J(\varphi)$ is negative for no value of $\tau$ on $I_a$. A weak J-deformation is termed proper if the above difference $J(\alpha) - J(\varphi)$ is positive and bounded from 0 uniformly for $\alpha$ on E, if $\tau$ on $I_a$ is bounded from 0.

Let $J^b$ denote the subset of points of M for which J < b. We say that J is upper-reducible at $\alpha$ and on $J^b$ if $b > J(\alpha)$ and if corresponding to an arbitrary constant a with $b > a > J(\alpha)$ there exists a weak J-deformation of some M-neighborhood N of $\alpha$ relative to $J^b$ which is a proper deformation of $N\cdot(J^b - J^a)$. We term J upper-reducible at $\alpha$ if J is upper-reducible at $\alpha$ on each set $J^b$. In interpreting these definitions we understand that the null deformation is a weak J-deformation, and that any weak J-deformation is a proper deformation of an empty set.

As is well known the integrals of the calculus of variations are not in general 
upper semi-continuous. They are however upper-reducible under very general 
conditions as we shall see, and this upper-reducibility fills the gap caused by the 
lack of upper semi-continuity. The earlier proofs of upper-reducibility in the 
regular case made use of elementary extremals. Since these elementary ex- 
tremals are not in general available in the non-regular case, a different type of 
proof is necessary. The proof given in the next section is more closely related 
to the proof of the upper-reducibility of the Douglas-Dirichlet integral than 
to the earlier proof for regular simple integrals. See Morse and Tomp- 
kins [1]. 




2. A deformation problem 


The parameters in each local representation of our Riemannian manifold $\Sigma$ will be coordinates $(x^1, \dots, x^n)$ on a bounded region S of Cartesian n-space. In this local system we shall use the notation of vector analysis¹ letting x represent the vector whose coordinates are $(x^1, \dots, x^n)$. It will simplify matters and cause no loss of generality if S is assumed convex.

We shall admit curves on S with vector representations x(t), $0 \le t \le h$, in which each x(t) is absolutely continuous. The t-derivative of x(t) will be denoted by $\dot x(t)$. A particularly useful representation is that in terms of reduced length. Let $\alpha$ be a curve with length $L(\alpha)$. To obtain a representation of $\alpha$ in terms of reduced length t one sets $s = tL(\alpha)$, where s is the arc length on $\alpha$ measured from the initial point of $\alpha$. The parameter t then ranges from 0 to 1 inclusive. If x(t) is the reduced representation of $\alpha$, $|\dot x(t)| = L(\alpha)$ for almost all values of t.
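For a polygonal curve the reduced representation can be computed directly from the rule $s = tL(\alpha)$. The following is a minimal sketch under our own assumptions: the curve is given by its list of vertices, and the names `curve_length` and `reduced_point` are hypothetical.

```python
import math


def curve_length(pts):
    """Length of the polygonal curve with vertices pts."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def reduced_point(pts, t):
    """Point x(t) of the polygonal curve at reduced length t, 0 <= t <= 1.

    The arc length from the initial point to x(t) is s = t * L.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie on [0, 1]")
    s = t * curve_length(pts)
    for p, q in zip(pts, pts[1:]):
        seg = math.dist(p, q)
        if seg > 0.0 and s <= seg:
            w = s / seg  # fraction of the current segment already traversed
            return tuple(a + w * (b - a) for a, b in zip(p, q))
        s -= seg
    return tuple(pts[-1])


pts = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
print(curve_length(pts))        # 7.0
print(reduced_point(pts, 0.5))  # (3.0, 0.5)
```

On each segment the point moves with speed L in t, in agreement with the relation $|\dot x(t)| = L(\alpha)$.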

We have introduced various distances between curves. To state our results in the briefest fashion it will be convenient to introduce a distance between two representations x(t) and z(t) of two curves given with the same interval [0, h] for t. This distance will be defined by the relation

$$d[x(t), z(t)] = \max |x(t) - z(t)| \qquad (0 \le t \le h).$$

If for a fixed z(t), d(x, z) tends to 0, x(t) converges uniformly to z(t).

As a matter of permanent notation let $\beta$ be a fixed rectifiable curve and $\alpha$ a variable curve, both of positive length. Let x(t) and y(t) respectively be admissible representations of $\alpha$ and $\beta$, with $0 \le t \le 1$. We shall have occasion to consider a deformation in which $\beta$ is fixed and $\alpha$ is replaced at the time $\tau$ by a curve with the representation

$$x_\alpha^\tau(t) = x(t) + \tau[y(t) - x(t)]. \tag{2.1}$$

It is clear that such a deformation of $\alpha$ depends upon the representations x(t) and y(t), and not merely upon $\alpha$ and $\beta$ as points of M or L. We would like to choose representations of $\alpha$ and $\beta$ such that the following three conditions are fulfilled.

(A) The deformation (2.1) is admissible in the sense of (a) and (b) of §1.

(B) For any class of curves $\alpha$ for which $L(\alpha)$ is bounded and $\alpha\beta$ sufficiently small, $|\dot x_\alpha^\tau|$ is bounded independently of t, $\tau$, and $\alpha$.

(C) As $\alpha\beta$ tends to 0, $d[y(t), x_\alpha^\tau(t)]$ tends to 0 uniformly with respect to $\tau$.

We have been unable to find representations of $\alpha$ and $\beta$ such that (A), (B) and (C) are satisfied simultaneously. If $\alpha$ and $\beta$ are represented in terms of reduced length one could show that (A) and (B) are satisfied but that (C) fails in general. To meet this difficulty we modify our approach as follows. We seek not one representation y(t) of $\beta$ but many, in fact a representation $y_\alpha(t)$ of $\beta$ which is


¹ When dealing with $\Sigma$ in §7 and thereafter we shall use the notation of tensor analysis.




determined by $\alpha$ and $\beta$ together. We are thus concerned with a deformation of the form

$$x_\alpha^\tau(t) = x(t) + \tau[y_\alpha(t) - x(t)]. \tag{2.2}$$

The representation x(t) of $\alpha$ shall be in terms of reduced length. Because we have not been able to satisfy (C) as well as (A) and (B) we shall abandon the attempt to find a single deformation satisfying (A), (B) and (C) and seek a sequence of deformations $D(m, \beta)$, $m = 1, 2, \dots$, each of which satisfies (A) and (B) and which are such that the following is true.

(D) As $\alpha\beta$ and 1/m tend to 0, $d(y_\alpha, x_\alpha^\tau)$ tends to 0 uniformly with respect to $\tau$.

Our representation $y_\alpha(t)$ of $\beta$ will depend both on $\alpha$ and m, so that in $D(m, \beta)$, $x_\alpha^\tau(t)$ will depend on the parameter m. The parameter m will ordinarily be held fast and so will not be explicitly indicated.


3. The $(\alpha, m)$-representation $y_\alpha(t)$ of $\beta$


In accordance with the preceding section $\alpha$ is a variable curve with a reduced representation x(t), and $\beta$ is a fixed curve whose representation $y_\alpha(t)$ will depend on $\alpha$ and on an integer m. As an aid in defining $y_\alpha(t)$ we make use of earlier results of Morse [2] whereby the point x of S on $\alpha$ can be represented in terms of a $\mu$-parameter² as follows. Let $I_1$ represent the interval $0 \le \mu \le 1$. In the $\mu$-representation of $\alpha$, x is given by a function

$$x = X(\mu, \alpha) \tag{3.1}$$

which maps $I_1 \times M$ continuously into S. For a fixed $\alpha$, $X(\mu, \alpha)$ is a representation of $\alpha$, and as stated $X(\mu, \alpha)$ varies continuously on S with $\mu$ on $I_1$ and $\alpha$ on M.

We make use of the representations $X(\mu, \alpha)$ and $X(\mu, \beta)$ of $\alpha$ and $\beta$ respectively to divide $\alpha$ and $\beta$ into m successive arcs $\alpha_i$ and $\beta_i$ respectively $(i = 1, \dots, m)$ on each of which $\Delta\mu = 1/m$. Recalling that t represents the reduced length on $\alpha$, let $t^\alpha_{i-1}$ and $t^\alpha_i$ be the values of t on $\alpha$ at the end points of $\alpha_i$. We parameterize $\beta_i$ as follows. The parameter on $\beta_i$ will be denoted by t and have the values $t^\alpha_{i-1}$ and $t^\alpha_i$ at the end points of $\beta_i$. At an inner point p of $\beta_i$ the parameter t shall be such that $t - t^\alpha_{i-1}$ is proportional to the arc length on $\beta_i$ measured from its initial point to p. The resulting representation of $\beta$ is our $(\alpha, m)$-representation $y_\alpha(t)$ of $\beta$. Let $L^+$ denote the subset of curves of L of positive length. We are supposing that $\alpha$ lies on $L^+$.

We shall recall certain properties of convergence in length. See McShane [2], p. 51. If an arc with a reduced representation $\eta(t)$ converges in length to an arc with a reduced representation $\zeta(t)$, then $\eta(t)$ converges uniformly to $\zeta(t)$ and $\dot\eta(t)$ converges to $\dot\zeta(t)$ in the mean (understood of the first order). If a curve $\alpha$ converges in length to a curve $\gamma$, the length of any subarc tends to the length of any subarc of $\gamma$ to which it converges in the sense of Fréchet as $\alpha\gamma$


² Actually the present parameter is reduced $\mu$-length, bearing the same relation to the parameter $\mu$ of Morse [3] that reduced length bears to length.




tends to 0. With this understood we see that, for m and $\beta$ fixed, $t^\alpha_i$ and $L(\alpha_i)$ are continuous functions of $\alpha$ on L. Two first properties of the $(\alpha, m)$-representation $y_\alpha(t)$ of the fixed curve $\beta$ result as follows.

(i) For a fixed m, $y_\alpha(t)$ is continuous in t and $\alpha$, for t on [0, 1] and $\alpha$ on L.

(ii) For a fixed m and curves $\alpha$ and $\gamma$ on $L^+$, $y_\alpha(t)$ converges uniformly to $y_\gamma(t)$ and $\dot y_\alpha(t)$ converges in the mean to $\dot y_\gamma(t)$ as $\alpha$ converges in length to $\gamma$.

For almost all t on the interval $[t^\alpha_{i-1}, t^\alpha_i]$

$$|\dot y_\alpha(t)| = \frac{L(\beta_i)}{L(\alpha_i)}\,L(\alpha). \tag{3.2}$$

If then $\alpha$ is restricted to a class $L^\kappa$ of curves of $L^+$ whose lengths are at most $\kappa$ then for almost all t

$$|\dot y_\alpha(t)| \le \kappa\,\frac{L(\beta_i)}{L(\alpha_i)} \qquad [L(\alpha_i) \ne 0]. \tag{3.3}$$

For each m there exists an M-neighborhood $N_m$ of $\beta$ so small that when $\alpha$ is on $N_m$

$$L(\alpha_i) > \frac{L(\beta_i)}{2} \tag{3.4}$$

by virtue of the lower semi-continuity of $L(\alpha_i)$. With the aid of (3.3) and (3.4) we then have the result.

(iii) Corresponding to the curves $\alpha$ on $L^\kappa$ and the integer m there exists an M-neighborhood $N_m$ of $\beta$ such that for almost all t

$$|\dot y_\alpha(t)| < 2\kappa \qquad [\alpha \text{ on } (L^\kappa\cdot N_m)]. \tag{3.5}$$


As $\alpha\beta$ tends to 0 the euclidean distance between the points of $\alpha$ and $\beta$ respectively bearing the parameter $t^\alpha_i$ tends to 0 uniformly for all m > 0 and $i = 1, \dots, m$. Moreover the euclidean diameters of the arcs $\alpha_i$ and $\beta_i$ used in the definition of $y_\alpha(t)$ tend to 0 with $\alpha\beta$ and 1/m. We thus obtain a final property of $y_\alpha(t)$.

(iv) The distance $d[x(t), y_\alpha(t)]$ tends to 0 as $\alpha\beta$ and 1/m both tend to 0.


4. The deformation $D(m, \beta)$


We can now define a deformation $D(m, \beta)$ satisfying conditions (A), (B) and (D) of §2. In this deformation $\beta$ is a fixed curve with a variable representation. The time $\tau$ varies on the interval [0, 1]. The curve $\alpha$ to be deformed has a positive length and a reduced representation x(t). For each fixed positive integer m we use the $(\alpha, m)$-representation $y_\alpha(t)$ of $\beta$. At the time $\tau$ the image $\varphi(\alpha, \tau)$ of $\alpha$ under $D(m, \beta)$ shall have the representation

$$x_\alpha^\tau(t) = x(t) + \tau[y_\alpha(t) - x(t)] \qquad (0 \le \tau \le 1). \tag{4.0}$$

The condition (a). We begin by showing that $\varphi(\alpha, \tau)$ satisfies (a) of §1. To that end let $\alpha$ and $\alpha_0$ be curves of $L^+$, and let $\tau$ and $\tau_0$ be values of $\tau$ on




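[0, 1]. Let x(t) and $x_0(t)$ be reduced representations of $\alpha$ and $\alpha_0$ respectively. Set

$$\tau - \tau_0 = \Delta\tau, \qquad x(t) - x_0(t) = \Delta x,$$
$$x_\alpha^\tau(t) - x_{\alpha_0}^{\tau_0}(t) = \Delta x_\alpha, \qquad y_\alpha(t) - y_{\alpha_0}(t) = \Delta y_\alpha.$$

Referring to (4.0) we obtain

$$\Delta x_\alpha = (1 - \tau)\Delta x + (y_{\alpha_0} - x_0)\Delta\tau + \tau\Delta y_\alpha. \tag{4.1}$$

Taking maxima as t ranges over [0, 1] we have

$$\max |\Delta x_\alpha| \le (1 - \tau)\max |\Delta x| + |\Delta\tau| \max |x_0 - y_{\alpha_0}| + \tau \max |\Delta y_\alpha|. \tag{4.2}$$

If $\alpha$ converges in length to $\alpha_0$, $\max |\Delta x|$ and $\max |\Delta y_\alpha|$ will tend to 0 (see (i) §3), and if $|\Delta\tau|$ also tends to 0, $\max |\Delta x_\alpha|$ will tend to 0 in accordance with (4.2). To show that $\varphi(\alpha, \tau)$ satisfies (a) it remains to show that $\varphi(\alpha, \tau)$ converges in length to $\varphi(\alpha_0, \tau_0)$ as $\alpha$ converges in length to $\alpha_0$ and $\tau$ converges to $\tau_0$.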
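The lengths of $\varphi(\alpha, \tau)$ and $\varphi(\alpha_0, \tau_0)$ have an absolute difference

$$\left|\int_0^1 |\dot x_\alpha^\tau|\,dt - \int_0^1 |\dot x_{\alpha_0}^{\tau_0}|\,dt\right| \le \int_0^1 |\Delta\dot x_\alpha|\,dt. \tag{4.3}$$

From (4.1) we find that

$$\Delta\dot x_\alpha = (1 - \tau)\Delta\dot x + (\dot y_{\alpha_0} - \dot x_0)\Delta\tau + \tau\Delta\dot y_\alpha \tag{4.4}$$

for almost all t. But in accordance with (ii) of §3, $\dot x$ and $\dot y_\alpha$ converge in the mean to $\dot x_0$ and $\dot y_{\alpha_0}$ respectively as $\alpha$ converges in length to $\alpha_0$. If in addition $|\Delta\tau|$ converges to 0 it follows from (4.4) that the integral

$$\int_0^1 |\Delta\dot x_\alpha|\,dt$$

converges to 0. Hence $\varphi(\alpha, \tau)$ converges in length to $\varphi(\alpha_0, \tau_0)$ as $\alpha$ converges in length to $\alpha_0$ and $\tau$ converges to $\tau_0$. Cf. McShane [2], p. 51.

Thus condition (a) is satisfied.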

The condition (b). That condition (b) of §1 is satisfied by $\varphi(\alpha, \tau)$ follows at once from (4.2) upon setting $|\Delta x|$ and $|\Delta y_\alpha|$ equal to 0 in (4.2) and observing that $|x_0 - y_{\alpha_0}|$ is bounded since S is bounded.

Of the conditions (A), (B) and (D) of §2, (A) is thus established for $\varphi(\alpha, \tau)$. Condition (B) is satisfied as a consequence of property (iii) of $y_\alpha(t)$ in §3. To show that (D) is satisfied note that

$$d(y_\alpha, x_\alpha^\tau) = (1 - \tau)\max |x(t) - y_\alpha(t)| = (1 - \tau)\,d(x, y_\alpha) \tag{4.5}$$

where the maximum is for $0 \le t \le 1$. Property (iv) of §3 of $y_\alpha$ shows that this distance tends to 0 with $\alpha\beta$ and 1/m. Thus (A), (B) and (D) are satisfied by the deformation $\varphi(\alpha, \tau)$.




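We add the lemma.
LEMMA 4.1. The image $\varphi(\alpha, \tau)$ of $\alpha$ under $D(m, \beta)$ satisfies the relation

$$L(\varphi) \le L(\alpha) + \tau[L(\beta) - L(\alpha)] \le \max [L(\alpha), L(\beta)] \qquad (0 \le \tau \le 1). \tag{4.6}$$

We see that

$$L(\varphi) = \int_0^1 |\dot x_\alpha^\tau|\,dt = \int_0^1 |(1 - \tau)\dot x + \tau\dot y_\alpha|\,dt \le (1 - \tau)\int_0^1 |\dot x|\,dt + \tau\int_0^1 |\dot y_\alpha|\,dt. \tag{4.7}$$

Relation (4.6) follows.
Relation (4.7) shows that $L(\varphi)$ is convex in $\tau$.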
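The bound (4.6) can be checked numerically in a simplified setting. The sketch below rests on assumptions of ours: both curves are polygonal with vertices listed at the same parameter values, so that the deformed curve is again polygonal, and the $(\alpha, m)$-representation of §3 is not constructed. The names `deform` and `length_bound` are hypothetical.

```python
import math


def length(pts):
    """Length of the polygonal curve with vertices pts."""
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def deform(x, y, tau):
    """Vertices of the curve x + tau*(y - x).

    x and y list the vertices of two polygonal curves at the same parameter values.
    """
    if len(x) != len(y):
        raise ValueError("x and y must share their parameter values")
    return [tuple(a + tau * (b - a) for a, b in zip(p, q)) for p, q in zip(x, y)]


def length_bound(x, y, tau):
    """Right member L(alpha) + tau*[L(beta) - L(alpha)] of the first inequality."""
    return (1.0 - tau) * length(x) + tau * length(y)


x = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # straight segment
y = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]  # broken line with the same end points
for tau in (0.0, 0.25, 0.5, 1.0):
    assert length(deform(x, y, tau)) <= length_bound(x, y, tau) + 1e-12
```

The inequality is the triangle inequality applied segment by segment, which is the polygonal counterpart of (4.7).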


5. The upper-reducibility of J 


To define our integral J we require a function f(x, r) of the point x on S and of a vector r. Let S' be a region containing the closure $\bar S$ of S. We suppose that f is of class³ C' for x on S' and any $r \ne 0$. The function f shall be homogeneous in the sense that

$$f(x, kr) = k f(x, r) \qquad (k \ge 0).$$

We also suppose that f is a convex function of r. This implies that the Weierstrass E-function is never negative. Under these hypotheses $J(\alpha)$ is a continuous function of $\alpha$ on L and a lower semi-continuous function of $\alpha$ on M on any subset $L^\kappa$ of M. See Morse and Ewing [1]. We shall prove that $J(\alpha)$ is upper-reducible at each curve $\beta$ of $L^+$.

We make use of the deformation $D(m, \beta)$ for a fixed m and $\beta$. Recall that the subset of curves of positive lengths at most $\kappa$ is denoted by $L^\kappa$. As a function of $\tau$, $J(\varphi)$ is almost convex in the sense of the following lemma.

LEMMA 5.1. If $\alpha$ and $\beta$ belong to $L^\kappa$ and $\varphi(\alpha, \tau)$ is the image of $\alpha$ under the deformation $D(m, \beta)$, then

$$J(\varphi) \le J(\alpha) + \tau[J(\beta) - J(\alpha)] + \tau h(\alpha, \beta, m) \qquad (0 \le \tau \le 1) \tag{5.1}$$

where $h \ge 0$ and is less than a preassigned positive constant e if m is sufficiently large and $\alpha$ is on a sufficiently small M-neighborhood $\mathfrak{N}_e$ of $\beta$.
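By virtue of the convexity of f(x, r) in r,

$$J(\varphi) \le \int_0^1 f(x_\alpha^\tau, \dot x)\,dt + \tau\int_0^1 [f(x_\alpha^\tau, \dot y_\alpha) - f(x_\alpha^\tau, \dot x)]\,dt. \tag{5.2}$$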


The differences

$$x_\alpha^\tau(t) - x(t) = \tau(y_\alpha - x), \qquad x_\alpha^\tau(t) - y_\alpha(t) = (1 - \tau)(x - y_\alpha) \tag{5.3}$$


³ By virtue of the homogeneity of f in r, $f_x$ exists and is continuous even when r = 0.




converge to 0 uniformly with respect to $\tau$ as $\alpha\beta$ and 1/m tend to 0 in accordance with property (D) of the deformation $D(m, \beta)$. Hence for x = x(t)

$$\int_0^1 f(x_\alpha^\tau, \dot x)\,dt \le \int_0^1 f(x, \dot x)\,dt + R_1 \tag{5.4}$$

where $R_1 \ge 0$ and is of the character of $\tau h$ in (5.1). Use is thereby made of the fact that $|\dot x| \le \kappa$ for almost all t, and that for $|r| \le \kappa$ and x on S, $f_x$ is bounded. In the second term on the right of (5.2), $|\dot y_\alpha| \le 2\kappa$ in accordance with (iii) of §3, provided m is sufficiently large and $\alpha$ is on $N_m\cdot L^\kappa$. The second term is thus of the form

$$\tau[J(\beta) - J(\alpha)] + \tau h_1(\alpha, \beta, m) \tag{5.5}$$

where $|h_1|$ has the character of h in the lemma. The lemma follows.
It follows from (5.1) that $D(m, \beta)$ is a proper J-deformation of any point $\alpha$ of $L^\kappa$ on a sufficiently small M-neighborhood of $\beta$ for which

$$J(\alpha) - J(\beta) > h(\alpha, \beta, m) \tag{5.6}$$

initially. But there may be points $\alpha$ arbitrarily near $\beta$ at which $J(\alpha) < J(\beta)$, and for such points we could not affirm that $D(m, \beta)$ was even a weak J-deformation of $\alpha$. We are able however to modify $D(m, \beta)$, depending upon the initial value of $J(\alpha)$, obtaining thereby a deformation $D(m, \beta, a)$ which more nearly meets our needs.

The deformation $D(m, \beta, a)$. Let a and c be constants such that

$$a > c > J(\beta).$$

For each $\alpha$ on $L^\kappa$ we shall now define a value $\tau(\alpha)$ of the time $\tau$. For points $\alpha$ of $L^\kappa$ at which $c \le J(\alpha) \le a$ let $\tau(\alpha)$ be a value of $\tau$ which divides the interval [0, 1] in the same ratio as that in which $J(\alpha)$ divides the interval [c, a]. For $J(\alpha) > a$ we take $\tau(\alpha)$ as 1, and for $J(\alpha) < c$ we take $\tau(\alpha)$ as 0. Recalling that $J(\alpha)$ is continuous in $\alpha$ on L, by virtue of hypotheses at the beginning of §5 together with Theorem 4.1 of Morse and Ewing [1], we see that $\tau(\alpha)$ is also continuous on $L^\kappa$. The deformation $D(m, \beta, a)$ is now obtained from $D(m, \beta)$ by deforming $\alpha$ as under $D(m, \beta)$ until the time $\tau(\alpha)$ is reached, and holding the image of $\alpha$ fast thereafter. This deformation has the following properties.
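The stopping time $\tau(\alpha)$ depends on $\alpha$ only through the value $J(\alpha)$. A minimal sketch, with hypothetical names and with the number `j_alpha` standing for $J(\alpha)$, reads as follows; the second function gives the time actually reached along $D(m, \beta)$ when the image is held fast after $\tau(\alpha)$.

```python
def stopping_time(j_alpha, c, a):
    """tau(alpha): divides [0, 1] in the ratio in which J(alpha) divides [c, a]."""
    if not a > c:
        raise ValueError("the constants must satisfy a > c")
    if j_alpha >= a:
        return 1.0
    if j_alpha <= c:
        return 0.0
    return (j_alpha - c) / (a - c)


def stopped_time(tau, j_alpha, c, a):
    """Time reached along D(m, beta): follow it until tau(alpha), then hold fast."""
    return min(tau, stopping_time(j_alpha, c, a))


print(stopping_time(1.5, 1.0, 2.0))      # 0.5
print(stopped_time(0.8, 1.5, 1.0, 2.0))  # 0.5
print(stopped_time(0.2, 1.5, 1.0, 2.0))  # 0.2
```

The function is continuous and piecewise linear in `j_alpha`, so the continuity of $\tau(\alpha)$ is inherited from that of $J(\alpha)$.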

Like $D(m, \beta)$ it satisfies conditions (a) and (b) of §1 and so is admissible.

In terms of the preceding constant c we write (5.1) in the form

$$J(\varphi) - J(\alpha) \le \tau[J(\beta) - c + h] + \tau[c - J(\alpha)]. \tag{5.7}$$

By virtue of Lemma 5.1, if m is sufficiently large, say $m > m_1$, and $\alpha$ is on $\mathfrak{N}_1\cdot L^\kappa$, where $\mathfrak{N}_1$ is a sufficiently small M-neighborhood of $\beta$, then $h(\alpha, \beta, m)$ will be less than the positive constant $c - J(\beta)$ and (5.7) will take the form

$$J(\varphi) - J(\alpha) \le \tau[c - J(\alpha)] \qquad (m > m_1). \tag{5.8}$$




Hence for $m > m_1$, $D(m, \beta, a)$ is a weak J-deformation of $\mathfrak{N}_1\cdot L^\kappa$. This follows for $J(\alpha) > c$ from (5.8), and for $J(\alpha) \le c$ from the fact that $D(m, \beta, a)$ is the null deformation. From (5.8) we see that for $m > m_1$, $D(m, \beta, a)$ is a proper deformation of the subset of $\mathfrak{N}_1\cdot L^\kappa$ for which $J(\alpha) \ge a$.

To establish the upper-reducibility of J on L at a curve $\beta$ we assume that L is bounded with J. Conditions on f that L be bounded with J are given in Morse and Ewing [1], Theorem 3.2. When L is bounded with J, a set $J^b$ lies on some set $L \le \kappa$ for $\kappa$ sufficiently large. To establish the upper-reducibility of J at a curve $\beta$ of positive length we refer to the definition of upper-reducibility and choose an arbitrary constant $a > J(\beta)$ and a second arbitrary constant b > a. To apply Lemma 5.1 we choose $\kappa$ so large that $J^b$ is on the set $L < \kappa$. We then apply the deformation $D(m, \beta, a)$ with $m > m_1$. Under this deformation $\mathfrak{N}_1\cdot L^\kappa$ is weakly J-deformed, while $\mathfrak{N}_1\cdot(J^b - J^a)$ suffers a proper J-deformation. Hence J is upper-reducible on $J^b$ at $\beta$. But a and b are arbitrary constants subject to the condition $b > a > J(\beta)$. Hence J is upper-reducible on L at $\beta$, in accordance with the definition of upper-reducibility.

We thus have the following theorem.

THEOREM 5.1. If L is bounded with J, J is upper-reducible on L at each rectifiable curve $\beta$ of positive length.⁴


6. The homotopy theorem 


The function J is said to be homotopically ordinary at a point $\beta$ of L if there exists a proper J-deformation of some J-neighborhood of $\beta$. If not homotopically ordinary at $\beta$, J will be termed homotopically critical. The fundamental theorem of this section is that under suitable conditions on f a point $\beta$ which is homotopically critical is an "extremal". This is a semi-topological generalization of the theorem of Euler that, under suitable conditions on f, a minimizing curve is an extremal. For a minimizing curve clearly defines a homotopic critical point.

We shall need the condition that f(x, r) be strongly convex at (x, p), $p \ne 0$. Since we are assuming that f is of class C', the required condition is that $E(x, p, r) \ge 0$ for the given (x, p) and arbitrary r, and that it vanish only when r = kp where k is a non-negative constant. With this understood let x = z(s), $0 \le s \le s_0$, be a representation of a curve $\beta$ in terms of length. A pair $[z(s), \dot z(s)]$ at which $\dot z(s) \ne 0$ will be termed an element tangent to $\beta$. A set of elements tangent to $\beta$ will be said to include almost all such elements if the corresponding values of s include almost all values of s on the interval $[0, s_0]$. With this understood Theorem 4.2 of Morse and Ewing [1] takes the following form.

THEOREM 6.0. If f is strongly convex at almost all elements tangent to $\beta$ and is positive semi-normal at all points of $\beta$, then $\alpha$ converges in L-length to $\beta$ if $\alpha$ converges in J-length to $\beta$.


⁴ The theorem holds even when $L(\beta) = 0$, as one shows by a trivial modification of the proof.




We shall make use of "variations" $\eta(t)$ of class C' for $0 \le t \le 1$, vanishing for t = 0 and t = 1.

Let $\beta$ be a fixed curve of positive length and let $\alpha$ be a curve on an L-neighborhood of $\beta$. Let x(t) be the reduced representation of $\alpha$. We shall consider a deformation $\Delta(\eta, \beta)$ of an L-neighborhood of $\beta$ in which $\alpha$ is replaced at the time $\tau$ by a curve $\psi(\alpha, \tau)$ on L with the representation

$$x_\alpha^\tau(t) = x(t) + \tau\eta(t) \qquad (0 \le \tau \le e). \tag{6.1}$$

Recall that $|\dot x(t)| = L(\alpha)$ for almost all t.

The neighborhood $N$ and the value of $e$. We shall restrict $\alpha$ to so small an $L$-neighborhood $N$ of $\beta$ that $L(\alpha)$ is bounded and bounded from 0. We take $e$ and $N$ so small that for $\alpha$ on $N$ and $\tau$ on $[0, e]$, the image curves $x_\tau$ are on $S$, and for almost all $t$

(6.2) $0 < \sigma \leq |\dot{x}_\tau(t)| \leq \rho,$

where $\rho$ and $\sigma$ are positive constants.

Lemma 6.1. The deformation $\psi(\alpha, \tau)$ satisfies the conditions (a) and (b) of §1 and is accordingly admissible.

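To establish (a) suppose that $\alpha$ converges in length to $\alpha_1$ and $\tau$ converges to $\tau_1$. Let $x_1(t)$ be the reduced representation of $\alpha_1$ and set $x_{1\tau_1}(t) = x_1(t) + \tau_1\eta(t)$. We have

(6.3) $|x_\tau(t) - x_{1\tau_1}(t)| \leq |x(t) - x_1(t)| + |\tau - \tau_1|\,|\eta(t)|.$

That $x(t)$ converges uniformly to $x_1(t)$ as $\alpha$ converges in length to $\alpha_1$ follows from the fact that $t$ is the reduced length on $\alpha$ and $\alpha_1$ respectively. From (6.3) we see then that $\psi(\alpha, \tau)$ converges in the sense of Fréchet to $\psi_1 = \psi(\alpha_1, \tau_1)$, as $\alpha$ converges in length to $\alpha_1$ and $\tau$ converges to $\tau_1$.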

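It remains to show that $\psi$ converges in length to $\psi_1$. Observe that

(6.4) $|L(\psi) - L(\psi_1)| = \left|\int_0^1 \big(|\dot{x} + \tau\dot{\eta}| - |\dot{x}_1 + \tau_1\dot{\eta}|\big)\,dt\right| \leq \int_0^1 |\dot{x} - \dot{x}_1|\,dt + |\tau - \tau_1| \int_0^1 |\dot{\eta}|\,dt.$

As noted in §3 the integral of $|\dot{x} - \dot{x}_1|$ converges to 0 as $\alpha$ converges in length to $\alpha_1$. Relation (6.4) then shows that $\psi$ converges in length to $\psi_1$ as stated. The deformation $\psi(\alpha, \tau)$ thus satisfies (a).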

Condition (b) of §1 is satisfied if $\psi(\alpha, \tau)$ maps the interval for $\tau$ continuously into $M$, uniformly with respect to $(\alpha, \tau)$. That condition (b) is satisfied follows from the form of (6.1).

The first variation of $J$. Relation (6.2) holds for $\alpha$ on $N$ and $\tau$ on $[0, e]$. For such an $\alpha$ and $\tau$ set

(6.5) $\int_0^1 f(x_\tau, \dot{x}_\tau)\,dt = w(\alpha, \tau).$


Granting the possibility of differentiating under the integral sign we have

(6.6) $w_\tau(\alpha, \tau) = \int_0^1 \big[\eta^i f_{x^i}(x_\tau, \dot{x}_\tau) + \dot{\eta}^i f_{r^i}(x_\tau, \dot{x}_\tau)\big]\,dt.$

To justify this differentiation one forms the difference quotient $Q$ from the integrand in (6.5), assuming values $\tau$ and $\tau + \Delta\tau$. Since $f$ is of class $C'$ and (6.2) holds, $|Q|$ is bounded for almost all $t$ and for a sequence of values of $\Delta\tau$ converging to 0. As $\Delta\tau$ tends to 0, $Q$ converges to the integrand of (6.6) for almost all $t$. It follows from the Lebesgue integration theorem that (6.6) holds as stated.
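
The boundedness of the difference quotient can be made explicit along the following lines. This is only a sketch under stated assumptions: for a fixed $t$ at which (6.2) holds, the mean value theorem is applied to $g(\tau) = f(x_\tau(t), \dot{x}_\tau(t))$, and the partial derivatives of $f$ are assumed bounded on the set of $(x, r)$ that actually occurs. The symbols $\theta$ and $C$ are introduced here for illustration.

```latex
% theta lies between tau and tau + Delta tau and depends on t;
% C is a constant majorant, assumed finite on the set sigma <= |r| <= rho.
\begin{align*}
Q &= \frac{f(x_{\tau+\Delta\tau}, \dot{x}_{\tau+\Delta\tau}) - f(x_\tau, \dot{x}_\tau)}{\Delta\tau}
   = \eta^i f_{x^i}(x_\theta, \dot{x}_\theta) + \dot{\eta}^i f_{r^i}(x_\theta, \dot{x}_\theta),\\
|Q| &\leq \max|\eta| \,\sup|f_x| + \max|\dot{\eta}| \,\sup|f_r| = C .
\end{align*}
% With the constant majorant C on [0,1], dominated convergence gives
% \lim_{\Delta\tau \to 0} \int_0^1 Q \, dt = \int_0^1 \lim_{\Delta\tau \to 0} Q \, dt .
```

The constant majorant is what allows the limit in $\Delta\tau$ to be taken under the integral sign.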

For the remainder of this section we add the hypothesis that $f$ be of class $C''$ for $r \neq 0$.

We refer to the constants $\rho$ and $\sigma$ of (6.2) and state the following lemma.

Lemma 6.2. For $x$ on $S$ and $|p|$ and $|q|$ on the closed interval $[\sigma, \rho]$

(6.7) $|f_x(x, p) - f_x(x, q)| \leq H\,|p - q|,$

(6.8) $|f_r(x, p) - f_r(x, q)| \leq K\,|p - q|,$

where $H$ and $K$ are positive constants.

Relations (6.7) and (6.8) hold when $p = q$ regardless of the choice of $H$ and $K$. To establish (6.7) note that

(6.9) $\dfrac{|f_x(x, p) - f_x(x, q)|}{|p - q|} \qquad (p \neq q)$

is bounded for the variables admitted in the lemma if one excludes a neighborhood of the set of pairs $(p, q)$ in which $p = q$. The set of pairs $(p, p)$ for which $|p|$ is on the closed interval $[\sigma, \rho]$ forms a closed set $T$. Neighboring each pair of $T$ but excluding pairs on $T$ the quotient (6.9) is bounded, since $f$ is of class $C''$. It follows that the quotient (6.9) has a bound $H$ for the variables admitted. Hence (6.7) holds. The proof of (6.8) is similar.
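
The local boundedness of the quotient (6.9) near the set $T$ can be sketched with the mean value theorem in integral form. This is an illustrative reading rather than the authors' wording. It assumes that the derivative in (6.9) is taken with respect to $x$, and that $p$ and $q$ are close enough for the segment $q + s(p - q)$, $0 \leq s \leq 1$, to avoid $r = 0$, where $f$ need not be of class $C''$.

```latex
% |f_{x^i r}| denotes the Euclidean norm of the vector (f_{x^i r^j}), j = 1, ..., n.
\begin{align*}
f_{x^i}(x, p) - f_{x^i}(x, q)
  &= \int_0^1 \frac{d}{ds}\, f_{x^i}\big(x,\, q + s(p - q)\big)\, ds \\
  &= (p^j - q^j) \int_0^1 f_{x^i r^j}\big(x,\, q + s(p - q)\big)\, ds ,\\
\big| f_{x^i}(x, p) - f_{x^i}(x, q) \big|
  &\leq |p - q| \, \max_{0 \leq s \leq 1} \big| f_{x^i r}\big(x,\, q + s(p - q)\big) \big| .
\end{align*}
```

Since the second derivatives are continuous for $r \neq 0$, the maximum on the right is bounded for pairs near $T$, which is the local bound used in the proof.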

The following lemma is essential.

Lemma 6.3. The function $w_\tau(\alpha, \tau)$ is continuous in $(\alpha, \tau)$ on the domain for which (6.2) holds, that is for $\alpha$ on $N$ and $\tau$ on $[0, e]$.

We begin by proving (i) and (ii).

(i) The function $w_\tau(\alpha, \tau)$ is continuous in $\tau$ uniformly for $\alpha$ on $N$.

The partial derivatives $f_x(x, r)$ and $f_r(x, r)$ appearing in (6.6) are uniformly continuous in $x$ and $r$, for $x$ on $S$ and $|r|$ on the interval $[\sigma, \rho]$. It follows that the integrand of (6.6) is a continuous function of $\tau$, uniformly for $\alpha$ on $N$ and for almost all $t$ on $[0, 1]$. Statement (i) follows.

(ii) For each $\tau$ on $[0, e]$, $w_\tau(\alpha, \tau)$ is a continuous function of $\alpha$ on $N$.

To establish (ii) let $\alpha$ and $\alpha_1$ be points on $N$ and set

$u(t) = x_\tau(t), \qquad v(t) = x_{1\tau}(t).$


5 Our results could be obtained with suitable changes in proof if $f_x$ and $f_r$ were merely subject to appropriate Lipschitz conditions.


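Form the difference

(6.10) $\begin{aligned} w_\tau(\alpha, \tau) - w_\tau(\alpha_1, \tau) ={} & \int_0^1 \eta^i \big[f_{x^i}(u, \dot{u}) - f_{x^i}(v, \dot{u})\big]\,dt + \int_0^1 \eta^i \big[f_{x^i}(v, \dot{u}) - f_{x^i}(v, \dot{v})\big]\,dt \\ & + \int_0^1 \dot{\eta}^i \big[f_{r^i}(u, \dot{u}) - f_{r^i}(v, \dot{u})\big]\,dt + \int_0^1 \dot{\eta}^i \big[f_{r^i}(v, \dot{u}) - f_{r^i}(v, \dot{v})\big]\,dt. \end{aligned}$

Note that

(6.11) $u - v = x(t) - x_1(t).$

This difference tends uniformly to 0 as $\alpha$ converges in length to $\alpha_1$, since $t$ is the reduced length on $\alpha$ and $\alpha_1$. Moreover $|\dot{u}|$ and $|\dot{v}|$ are on the interval $[\sigma, \rho]$ of (6.2) for almost all $t$. It follows that the first and third integrals in (6.10) tend to 0 as $\alpha$ converges in length to $\alpha_1$. Upon using (6.7) we see that the second integral in (6.10) is at most

$H \int_0^1 |\eta|\,|\dot{u} - \dot{v}|\,dt = H \int_0^1 |\eta|\,|\dot{x} - \dot{x}_1|\,dt,$

and this again tends to 0 as $\alpha$ converges in length to $\alpha_1$. The last integral in (6.10) similarly tends to 0. Statement (ii) follows.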

The lemma is an immediate consequence of (i) and (ii). 

Let $y(t)$ be a reduced representation of $\beta$. Then

$w_\tau(\beta, 0) = \int_0^1 \big[\eta^i f_{x^i}(y, \dot{y}) + \dot{\eta}^i f_{r^i}(y, \dot{y})\big]\,dt = I(\eta),$

introducing $I(\eta)$. Regarded as a function of $\eta$, $I(\eta)$ is the "first variation" of $J$ at $\beta$. We say that $J$ has a "null first variation at $\beta$" if $I(\eta) = 0$ for all admissible variations $\eta$. We have required that $\eta$ be of class $C'$. For our purposes it would be immaterial if we had taken $\eta(t)$ absolutely continuous. For upon using the approximation theorems of the Lebesgue theory it is not difficult to prove that under the conditions $\eta(0) = \eta(1) = 0$, a necessary and sufficient condition that $I(\eta) = 0$ for all $\eta$ of class $C'$ is that $I(\eta) = 0$ for all $\eta$ which are absolutely continuous.
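
A small worked case may help fix the meaning of a null first variation. It is an illustration added here, not taken from the paper: the integrand is assumed to be the Euclidean length $f(x, r) = |r|$, and $\beta$ is assumed to be the straight segment from a point $a$ to a point $b \neq a$, with reduced representation $y(t) = a + t(b - a)$.

```latex
% Assumptions: f(x,r) = |r|, y(t) = a + t(b - a), 0 <= t <= 1, eta(0) = eta(1) = 0.
\begin{align*}
f_{x^i} &= 0, \qquad f_{r^i}(y, \dot{y}) = \frac{\dot{y}^i}{|\dot{y}|} = \frac{b^i - a^i}{|b - a|},\\
I(\eta) &= \int_0^1 \dot{\eta}^i \, \frac{b^i - a^i}{|b - a|} \, dt
         = \frac{b^i - a^i}{|b - a|} \, \big[\eta^i(1) - \eta^i(0)\big] = 0 .
\end{align*}
```

Under these assumptions the first variation vanishes for every admissible $\eta$, so the segment has a null first variation, as one expects of a curve of least length.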

With this understood we term a rectifiable curve $\beta$ an extremal if $J$ has a null first variation at $\beta$. The fundamental theorem of this section is then the following.

THEOREM 6.1. If $\beta$ is homotopically critical and if convergence in $J$-length to $\beta$ implies convergence in $L$-length to $\beta$, then $\beta$ is an extremal.

The theorem will be established in the following equivalent form. If the first variation $I(\eta)$ of $J$ at $\beta$ is negative for some admissible variation $\eta$, and if convergence in $J$-length to $\beta$ implies convergence in $L$-length to $\beta$, then $\beta$ is a homotopically ordinary point of $J$. (The two forms are equivalent because $I(\eta)$ is linear in $\eta$: if $I(\eta) \neq 0$ for some admissible $\eta$, then $I$ is negative for $\eta$ or for $-\eta$.) We shall show that $J$ is homotopically ordinary at $\beta$ by exhibiting a proper $J$-deformation of some $L$-neighborhood of $\beta$. In this connection the definition of homotopically ordinary requires a $J$-neighborhood, but under the hypotheses of the theorem every $L$-neighborhood of $\beta$ contains a $J$-neighborhood of $\beta$, so that an $L$-neighborhood may be used instead. See Morse and Ewing [1], Theorem 4.2 for conditions under which convergence in $J$-length implies convergence in length.


Corresponding to the given $\eta$ we set up the deformation $\Delta(\eta, \beta)$. Then $w_\tau(\beta, 0) < 0$. By virtue of its continuity, $w_\tau(\alpha, \tau) < \text{const.} < 0$, for $\alpha$ on a sufficiently small $L$-neighborhood $N$ of $\beta$ and for $\tau$ on a sufficiently small interval $[0, e]$. For $0 \leq \tau \leq e$, $\Delta(\eta, \beta)$ is thus a proper $J$-deformation of the $L$-neighborhood $N$ of $\beta$. Hence $\beta$ is homotopically ordinary and the proof of the theorem is complete.
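
The way in which the negative bound on $w_\tau$ forces $J$ to decrease can be sketched as follows. The precise definition of a proper $J$-deformation belongs to §1 and is not reproduced here, so this is only an indicative estimate. It assumes that $w(\alpha, \tau)$ is the value of $J$ along $\psi(\alpha, \tau)$, so that $w(\alpha, 0) = J(\alpha)$; the constant $\gamma$ is a name introduced for the illustration.

```latex
% Assumption: w_tau(alpha, s) <= -gamma < 0 for alpha on N and 0 <= s <= e.
\begin{align*}
J\big(\psi(\alpha, \tau)\big) - J(\alpha)
  &= w(\alpha, \tau) - w(\alpha, 0)
   = \int_0^\tau w_\tau(\alpha, s)\, ds
   \leq -\gamma\,\tau \qquad (0 \leq \tau \leq e).
\end{align*}
```

Thus every curve of $N$ has its $J$-value lowered by at least $\gamma\tau$ at the time $\tau$, uniformly over $N$.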


7. The Riemannian manifold $\Sigma$


We turn now to a Riemannian manifold $\Sigma$ with a metric defined by a positive definite quadratic form

$ds^2 = g_{ij}(x)\,dx^i\,dx^j \qquad (i, j = 1, 2, \ldots, n).$

Cf. Morse [1], p. 107. We suppose the functions $g_{ij}(x)$ are of class $C''$ in terms of the local coordinates $(x)$ and that the transformations $z^i = z^i(x)$ from local coordinates $(x)$ to local coordinates $(z)$ are of class $C'''$. The manifold $\Sigma$ will be assumed compact.

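Given two points $p$ and $q$ on $\Sigma$ there exists a path of least length joining $p$ to $q$. This least length will be termed the distance $\delta(p, q)$. Given two rectifiable sensed curves $\alpha$ and $\beta$ on $\Sigma$ the Fréchet distance $\alpha\beta_{\mathcal{M}}$ between $\alpha$ and $\beta$ on $\Sigma$ will be defined in the usual way using $\delta(p, q)$. The length of a curve $\alpha$ on $\Sigma$ will be denoted by $\mathcal{L}(\alpha)$ using a script $\mathcal{L}$. The space of curves on $\Sigma$ with the distances $\alpha\beta_{\mathcal{M}}$ will be denoted by script $\mathcal{M}$, and the space of curves on $\Sigma$ with the distance

$\alpha\beta_{\mathcal{L}} = \alpha\beta_{\mathcal{M}} + |\mathcal{L}(\alpha) - \mathcal{L}(\beta)|$

will be denoted by $\mathcal{L}$.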

Let $S$ be a convex region on a Cartesian $n$-space in which the variables $(x)$ are admissible coordinates for $\Sigma$. A curve $\alpha$ on $S$ and the image on $\Sigma$ will be denoted by the same symbol. Let $\alpha$ and $\beta$ be curves on $S$ and $\Sigma$ and let $\alpha\beta_S$ denote the Fréchet distance on $S$ previously denoted by $\alpha\beta$. It is clear that for a fixed $\beta$, $\alpha\beta_S$ tends to 0 if and only if $\alpha\beta_{\mathcal{M}}$ tends to 0.

If $\alpha\beta_S$ tends to 0 and $L(\alpha)$ tends to $L(\beta)$ we say that $\alpha$ converges in $L$-length to $\beta$. If $\alpha$ is on $S$ the function $\mathcal{L}(\alpha)$ is represented by an integral of the form

$\int f(x, \dot{x})\,dt$

with $f(x, r)$ satisfying all the conditions imposed on $f(x, r)$ in Morse and Ewing [1]. Hence for $\beta$ on $S$, $\alpha$ converges in $\mathcal{L}$-length to $\beta$ if and only if $\alpha$ converges in $L$-length to $\beta$. Here $L$ represents ordinary length on $S$.

An immediate technical problem is that of deducing various properties of an integral $J$ along a curve of $\Sigma$ from properties of $J$ along subarcs, each in an admissible coordinate system. The problem of upper-reducibility of $J$ is such a problem. The principal difficulty will be met with the aid of the following lemmas.

Let

(7.1) $v^i = \varphi^i(u) \qquad (i = 1, \ldots, n)$

be an admissible transformation of coordinates neighboring a point $(u) = (u_0)$ with image $(v) = (v_0)$. Suppose $\varphi^i(u)$ has the form

$\varphi^i(u) = v_0^i + a^i_j\,(u^j - u_0^j) + \eta^i(u), \qquad a^i_j = \left.\dfrac{\partial \varphi^i}{\partial u^j}\right|_{(u) = (u_0)}.$

The remainder $\eta^i(u)$ is of class $C'''$ neighboring $(u_0)$ since $\varphi^i(u)$ is of class $C'''$. In the space $(u)$ let the solid $n$-sphere with radius $a$ and center at $(u_0)$ be denoted by $\sigma_a$. Our first lemma is as follows.
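Lemma 7.1. Corresponding to the transformation $T$ given by (7.1) there exists an admissible transformation $T_1$: $v^i = \psi^i(u)$, of a neighborhood $\sigma_{3\rho}$ of $(u_0)$ such that

(7.2) $\psi^i(u) = v_0^i + a^i_j\,(u^j - u_0^j) \quad [(u) \text{ on } \sigma_\rho], \qquad \psi^i(u) = \varphi^i(u) \quad [(u) \text{ on } \sigma_{3\rho} - \sigma_{2\rho}]$

for a sufficiently small $\rho$.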
To establish this lemma let $h(t)$ be a function of $t$ of class $C'''$ for $t$ on the interval $0 \leq t \leq 1$ with

$h(t) = 0 \quad (0 \leq t \leq 1/3), \qquad h(t) = 1 \quad (2/3 \leq t \leq 1).$

To continue we simplify the proof by supposing that $(u_0) = (v_0) = (0)$. No generality is lost thereby. Corresponding to a positive constant $\rho$ as yet undetermined consider the transformation

(7.3) $v^i = a^i_j\,u^j + h\!\left(\dfrac{|u|}{3\rho}\right)\eta^i(u) \qquad [(u) \text{ on } \sigma_{3\rho}].$

We shall show that this transformation satisfies the lemma if $\rho$ is sufficiently small. Such a transformation satisfies (7.2) formally in the case $(u_0) = (v_0) = (0)$. It remains to show that for a suitable choice of $\rho$ the transformation has a non-vanishing Jacobian and is one to one for $(u)$ on $\sigma_{3\rho}$.


6 Recall that the Jacobian $|a^i_j|$ does not vanish at $(u) = (u_0)$ for an admissible transformation.




To that end we make the substitution

(7.4) $v^i = \rho z^i, \qquad u^i = \rho x^i \qquad (i = 1, \ldots, n)$

in (7.3). We note that

$\eta^i(\rho x) = \rho\,\zeta^i(\rho, x)$

where $\zeta^i$ is of class $C'$ in its arguments for $|x|$ bounded and $\rho$ neighboring 0. In terms of the variables $(z)$ and $(x)$, (7.3) takes the form

(7.5) $z^i = a^i_j\,x^j + h\!\left(\dfrac{|x|}{3}\right)\zeta^i(\rho, x).$

For $\rho = 0$, (7.5) reduces to a non-singular collineation. If we restrict $|x|$ by the condition $|x| \leq 3$, then for $\rho$ sufficiently small the relation (7.5) is one to one and possesses a non-vanishing Jacobian. Hence for such a $\rho \neq 0$, (7.3) defines a one to one⁷ transformation with a non-vanishing Jacobian. The proof of the lemma is complete.
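
The proof only needs some function $h$ of class $C'''$ with the stated end values. One explicit choice, offered here as a hypothetical example and not as the authors' function, is a polynomial step on the middle third of the interval.

```latex
% Hypothetical choice of h; s = 3t - 1 maps [1/3, 2/3] onto [0, 1].
\[
h(t) =
\begin{cases}
0, & 0 \leq t \leq \tfrac{1}{3},\\[2pt]
35 s^4 - 84 s^5 + 70 s^6 - 20 s^7, \quad s = 3t - 1, & \tfrac{1}{3} \leq t \leq \tfrac{2}{3},\\[2pt]
1, & \tfrac{2}{3} \leq t \leq 1 .
\end{cases}
\]
% Check: dh/ds = 140 s^3 (1 - s)^3, so h(0) = 0 and h(1) = 35 - 84 + 70 - 20 = 1 in terms of s,
% and the first three derivatives vanish at s = 0 and at s = 1.
```

Because the first three derivatives of the polynomial vanish at both ends of the middle third, the three pieces join to give a function of class $C'''$ on $0 \leq t \leq 1$.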

A region $S$ over which a system of admissible coordinates $(u)$ of $\Sigma$ range will be used to designate this system. We shall make use of the following lemma.

Lemma 7.2. Let $S_1$ and $S_2$ be two admissible coordinate systems such that the points of $\Sigma$ represented by both $S_1$ and $S_2$ include a neighborhood of a point $p$. There then exists an admissible coordinate system $S$ which represents the same points of $\Sigma$ as does $S_2$ and which is such that for some neighborhood of $p$ the transformation from the coordinates of $S_1$ to those of $S$ is the identity.

Let $(u)$ and $(v)$ be respectively the coordinates of $S_2$ and $S_1$ and suppose that $(u_0)$ and $(v_0)$ correspond to $p$. Suppose further that the transformation $T$ given by (7.1) represents the relation between $S_2$ and $S_1$ neighboring $(u_0)$ and $(v_0)$, mapping a neighborhood of $(u_0)$ on $S_2$ onto a neighborhood of $(v_0)$ on $S_1$. Making use of the transformation $T_1$ of Lemma 7.1 set

$T_2 = T_1^{-1}\,T \qquad [(u) \text{ on } \sigma_{3\rho}]$

thereby defining $T_2$. We see that $T_2$ is the identity when $(u)$ is on $\sigma_{3\rho} - \sigma_{2\rho}$ and so can be extended as the identity over $S_2 - \sigma_{3\rho}$. Let $S_3$ be the image of $S_2$ under $T_2$ so extended, and let a point of $S_3$ represent the same point of $\Sigma$ as does its image on $S_2$.

Then $T_1$ gives the transformation from the coordinates, say $(w)$, of $S_3$ to those of $S_1$, at least neighboring $(w_0)$. But on a sufficiently small neighborhood of $(u_0)$, $T_1$ is affine, and as such can be extended over the whole of $S_3$. Let $S$ be the image of $S_3$ under this affine transformation and let a point of $S$ represent the same point of $\Sigma$ as does its image on $S_3$. Then the point with coordinates $(v_0)$ on $S$ corresponds to the point $(v_0)$ on $S_1$, and the transformation from a point of $S$ neighboring $(v_0)$ to a point of $S_1$ neighboring $(v_0)$ and representing the same point of $\Sigma$ is the identity.


7 This is not an ordinary inference from a non-vanishing Jacobian, but rather an inference in the large over all of $\sigma_{3\rho}$. It follows from the fact that for $\rho = 0$ the transformation is one to one over all of $\sigma_{3\rho}$.




8. The integral $J$ on $\Sigma$


With each coordinate system $(x)$ we have given a function $f(x, r)$ of $(x)$ and a contravariant vector $(r)$, with $f$ invariant with respect to admissible changes of coordinates. Let $\alpha$ be a rectifiable curve given as the continuous image $p(t)$ on $\Sigma$ of a $t$-interval $t_0 \leq t \leq t_1$ and with an absolutely continuous representation $x^i(t)$, $i = 1, \ldots, n$, in each coordinate system $(x)$ in which $\alpha$ enters. In each such system set $f(x, \dot{x}) = g(t)$. The value $g(t)$ is independent of the coordinate system used to define $g(t)$. We set

$J(\alpha) = \int_{t_0}^{t_1} g(t)\,dt.$

That this integral exists as a Lebesgue integral, and is independent of the representations of $\alpha$ used to define it, follows from our assumption that $f(x, r)$ is of class $C''$ in $(x)$ and $(r)$ for $r \neq 0$, and that $f(x, r)$ is homogeneous in $r$ in the sense of §5. These conditions will be assumed henceforth without explicit mention.

We also assume that $f(x, r)$ is convex in $(r)$ in each coordinate system. That this is an invariant condition is seen from the fact that it is equivalent to the condition that the invariant Weierstrass $E$-function $E(x, r, \sigma)$ be non-negative. Here

$E(x, r, \sigma) = f(x, \sigma) - \sigma^i f_{r^i}(x, r) \qquad [(r) \neq (0)]$

where $(r)$ and $(\sigma)$ are contravariant vectors.

The integrand $f(x, r)$ will be said to be positive semi-normal at a point $(x) = (c)$ if there exists a covariant vector $(b)$ defined at $(c)$ such that

(8.1) $f(c, r) > b_i r^i \qquad [(r) \neq 0]$

for all non-null contravariant vectors $(r)$ defined at $(c)$.
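
Condition (8.1) does not require $f$ to be positive. A simple example, constructed here for illustration and assuming Euclidean norms in the coordinates at $(c)$, is the length integrand shifted by a linear term with an arbitrary fixed covariant vector $(k)$; the symbol $k$ is introduced only for this sketch.

```latex
% Hypothetical integrand at the point (c): f(c,r) = |r| + k_i r^i.
\begin{align*}
\text{with } b_i = k_i : \qquad & f(c, r) - b_i r^i = |r| > 0 \qquad (r \neq 0),\\
\text{with } r^i = -k_i : \qquad & f(c, r) = |k| - |k|^2 < 0 \qquad \text{whenever } |k| > 1 .
\end{align*}
```

So this integrand is positive semi-normal at $(c)$ for every choice of $(k)$, although it takes negative values in some directions when $|k| > 1$.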
The integral $J(\alpha)$ on $\Sigma$ will be said to satisfy the condition of Hahn if the $\mathcal{L}$-lengths of curves $\alpha$ on $\Sigma$ on which $J(\alpha) \leq 0$ are bounded.

In deriving the consequences of these conditions we shall have occasion to use the reduced $\mu$-parameter of Morse, $0 \leq \mu \leq 1$, (to which we have referred earlier), along curves $\alpha$. In defining this parameter one uses the metric of $\Sigma$ defined by the distance $\delta(p, q)$. With this understood let $\alpha$ and $\beta$ be curves on $\Sigma$. A subarc $\alpha'$ of $\alpha$ will be said to correspond to that subarc $\beta'$ of $\beta$ on which the $\mu$-parameter ranges through the same values. As $\alpha\beta_{\mathcal{M}}$ tends to 0, $\alpha'\beta'_{\mathcal{M}}$ likewise tends to 0.

Lower semi-continuity of $J(\alpha)$. Suppose for a fixed $\beta$ that $\alpha\beta_{\mathcal{M}}$ tends to 0. Let $\beta$ be divided into $m$ successive arcs $\beta^1, \ldots, \beta^m$ such that $\beta^i$ lies in a coordinate system $S_i$. Let $\alpha^i$ be the subarc of $\alpha$ "corresponding" to $\beta^i$. As $\alpha\beta_{\mathcal{M}}$ tends to 0, $\alpha^i\beta^i_{\mathcal{M}}$ tends to 0 for each $i$, and conditions sufficient for the lower semi-continuity can be read off from the corresponding conditions in coordinate systems $S$.

In particular $J(\alpha)$ will be lower semi-continuous on any subset $\mathcal{L}^\kappa$ of $\mathcal{L}$ provided $f$ is convex in $(r)$.




To show that the conditions sufficient that $\mathcal{L}(\alpha)$ be bounded with $J(\alpha)$ are nominally the same as in a particular coordinate system we must revise the definition of a "pseudo-limiting" curve used in Morse and Ewing [1] in so far as that definition depends on the use of one coordinate system.

Without loss of generality we can suppose that each arc of unit $\mathcal{L}$-length on $\Sigma$ lies entirely in one coordinate system. This is a consequence of the fact that $\Sigma$ can be covered by a finite number of such systems while an arc of unit $\mathcal{L}$-length can be made arbitrarily small relative to distances in the coordinate systems by multiplying the form giving $ds$ by a suitable positive constant.

A sequence of point representations $p_m(t)$ of curves on $\Sigma$ will be said to converge uniformly to a representation $p(t)$, $t_0 \leq t \leq t_1$, if $p_m(t)$ converges to $p(t)$ for each $t$, uniformly with respect to $t$ on $[t_0, t_1]$. We term $p(t)$ absolutely continuous if the functions $x^i(t)$ representing $p(t)$ in each coordinate system into which $p(t)$ enters are absolutely continuous.

We begin with the following lemma. 

Lemma 8.1. Let $p_m(t)$, $m = 1, 2, \ldots$, be a sequence of arcs on $\Sigma$ on each of which $t$ is the $\mathcal{L}$-length with $t_0 \leq t \leq t_0 + 1$. There then exists a subsequence $q_\mu(t)$, $\mu = 1, 2, \ldots$, of the sequence $p_m(t)$ which converges uniformly to an absolutely continuous point function $p(t)$.

The arcs $p_m(t)$ have at least one cluster arc $\beta$ to which a subsequence $\omega$ converges in the sense of Fréchet. The arc $\beta$ lies in some coordinate system $S$ since $\mathcal{L}(\beta) \leq 1$. In $S$ let $x^i_\nu(t)$, $t_0 \leq t \leq t_0 + 1$, $\nu = 1, 2, \ldots$, be representations of those arcs of $\omega$ which lie in $S$. In accordance with Ascoli's theorem there exists a subsequence $\omega_1$ of the sequence $\omega$ for which the corresponding functions $x^i_\nu$ converge uniformly on $[t_0, t_0 + 1]$ to absolutely continuous functions $X^i(t)$. Upon taking as $p(t)$ the point of the curve represented in $S$ by $X^i(t)$, $i = 1, \ldots, n$, the lemma follows.
A pseudo-limiting curve. Let $p_m(t)$, $m = 1, 2, \ldots$, be a sequence $\sigma_0$ of representations of arcs on $\Sigma$ where $t$ is the $\mathcal{L}$-length, $0 \leq t \leq t_m$, and $t_m$ becomes infinite with $m$. The sequence $\sigma_0$ will be said to have a pseudo-limit $p(t)$, if $p(t)$ is defined and absolutely continuous for $0 \leq t < \infty$ and if $p_m(t)$ converges uniformly to $p(t)$ on each finite interval for $t$. It is naturally not required or expected that $t$ be the $\mathcal{L}$-length along $p(t)$.

Lemma 8.2. At least one subsequence of $\sigma_0$ possesses a pseudo-limit $P(t)$.

In accordance with Lemma 8.1 there exists a subsequence $\sigma_1$ of $\sigma_0$ for which the corresponding subsequence of point functions $p_m(t)$ converges uniformly for $0 \leq t \leq 1$ to an absolutely continuous point function $P_1(t)$. We proceed inductively assuming that $\sigma_{m-1}$ is a well defined subsequence of $\sigma_{m-2}$, $m = 3, 4, \ldots$. Using Lemma 8.1 again we infer the existence of a subsequence $\sigma_m$ of $\sigma_{m-1}$ such that the corresponding subsequence of functions $p_m(t)$ converges uniformly for $m - 1 \leq t \leq m$ to an absolutely continuous point function $P_m(t)$. We define $P(t)$ for $0 \leq t < \infty$ by setting

$P(t) = P_m(t) \qquad (m - 1 \leq t < m) \qquad (m = 1, 2, \ldots).$




Let $\sigma$ be a subsequence $q_m(t)$ of $\sigma_0$ in which $q_m(t)$ is the $m$th element of $\sigma_m$. It is clear that the sequence

$q_m(t),\ q_{m+1}(t),\ \ldots$

is a subsequence of $\sigma_0$ and that as $m$ becomes infinite $q_m(t)$ converges uniformly to $P(t)$ on each finite interval for $t$.

Bounded $\mathcal{M}$-compactness of $J(\alpha)$. We say that $J(\alpha)$ is boundedly $\mathcal{M}$-compact if for each constant $c$ the set $J^c$ is an $\mathcal{M}$-compact subset of $\mathcal{M}$. By virtue of the lower semi-continuity of $J(\alpha)$ on sets $\mathcal{L}^\kappa$, and with the use of the notion of pseudo-limiting curve one can prove the following theorem.

THEOREM 8.1. If $f$ is convex in $(r)$, positive semi-normal in each coordinate system, and if $J$ satisfies the condition of Hahn on $\Sigma$, then $J(\alpha)$ is boundedly $\mathcal{M}$-compact and $\mathcal{L}$ is bounded with $J$.

One proves that $\mathcal{L}$ is bounded with $J$ as in the case of a single coordinate system. See Morse and Ewing [1], Theorems 3.3 and 3.2. The bounded $\mathcal{M}$-compactness of $J$ then follows from the $\mathcal{M}$-compactness of $\mathcal{L}^\kappa$ for each $\kappa$ and the lower semi-continuity of $J(\alpha)$ on $\mathcal{L}^\kappa$.

The proof of the upper-reducibility of $J(\alpha)$ involves new difficulties. The deformations $D(m, \beta)$ as defined in a coordinate system $S$ depend upon the notion of straightness in the system $S$. In deforming a curve $\alpha$ under $D(m, \beta)$ each point $(x)$ on $\alpha$ is deformed along a straight line to its final destination. The difficulty is met with the aid of Lemma 7.2.

To prove the upper-reducibility of $J(\alpha)$ at a curve $\beta$ we break $\beta$ up into a sequence of arcs $\beta_i$, $i = 1, \ldots, \nu$, such that $\beta_i$ lies entirely in some coordinate system $S^{(i)}$. Let $p_i$ be the final end point of $\beta_i$. Then $p_i$ lies both in $S^{(i)}$ and $S^{(i+1)}$. We apply Lemma 7.2 successively to $p_1, p_2, \ldots$. We are able thereby to affirm the following. There exists a sequence

$S_1, S_2, \ldots, S_\nu \qquad [S_1 = S^{(1)}]$

of coordinate systems such that $S_i$ contains $\beta_i$ and such that for a suitably chosen spherical neighborhood $N_i$ of $p_i$, $i = 2, \ldots, \nu - 1$, the transformation on $N_i$ from the coordinates of $S_i$ to those of $S_{i+1}$ is the identity.

We can now define a deformation $\theta(\alpha, \tau)$ of curves $\alpha$ on an $\mathcal{M}$-neighborhood of $\beta$. Let $\alpha_i$ be the arc of $\alpha$ "corresponding" to $\beta_i$. We suppose $\alpha\beta_{\mathcal{M}}$ so small that $\alpha_i$ lies in $S_i$, $i = 1, \ldots, \nu$, and that the final end point $q_i$ of $\alpha_i$, $i = 2, \ldots, \nu - 1$, lies on $N_i$. We subject $\alpha_i$ to the deformation $D(m, \beta_i)$ set up in $S_i$. The deformations $D(m, \beta_i)$ and $D(m, \beta_{i+1})$ replace $q_i$ by the same point at the time $\tau$. Hence these deformations combine to define a deformation $D(m, \beta)$ of $\alpha$ in which $\alpha$ is replaced at the time $\tau$ by $\theta(\alpha, \tau)$.

With $\theta(\alpha, \tau)$ so defined Lemma 5.1 holds formally, with $L^\kappa$ replaced by $\mathcal{L}^\kappa$, $M$ by $\mathcal{M}$, and $\varphi(\alpha, \tau)$ by $\theta(\alpha, \tau)$. We continue, except for notation, exactly as in the proof of Theorem 5.1. We then obtain the following theorem.

THEOREM 8.2. If $\mathcal{L}$ is bounded with $J$ then $J$ is upper-reducible at each rectifiable curve $\beta$ on $\Sigma$.

Conditions sufficient that $\mathcal{L}$ be bounded with $J$ are given in Theorem 8.1.




9. The existence of homotopic critical points of $J$


The principal conditions on a function $J(\alpha)$ defined on a metric space $\mathcal{M}$ in order that the critical point theory apply are that each set $J^c$ be compact and $J$ be upper-reducible. We have seen that both these conditions are satisfied if for $(r) \neq 0$, $f$ is of class $C''$, if $f$ is convex in $(r)$ and positive semi-normal in each coordinate system, and if $J$ satisfies the condition of Hahn. We can now give simple conditions implying homotopic critical points.

We shall term the least upper bound of $J$ on an arbitrary subset $E$ of $\mathcal{M}$ the $J$-height of $E$.

Our chains and cycles shall be defined on $\mathcal{L}$ using $\mathcal{L}$-continuity. They shall be finite singular chains and cycles, taken mod 2. See Morse [1], p. 146. The extension to the case of a field of coefficients is immediate if desired. We shall admit relative $k$-cycles $u$ in which the modulus is always a set $J^c$ in which $c$ is less than the $J$-height of $u$.

The principal theorem on the existence of homotopic critical points is as follows. We assume that each set $J^c$ is $\mathcal{M}$-compact.⁸

THEOREM 9.1. Let $K$ be a homology class of $k$-cycles mod $J^a$, non-bounding mod $J^a$ on $\mathcal{L}$ and let $c$ be the greatest lower bound of $J$-heights of cycles of $K$. If $c > a$, and if $J$ is upper-reducible at each point of $J^c$ there exists a homotopic critical point at which $J = c$.

The theorem is a consequence of Corollary 3.3 of Morse [4].

If each set $J^c$ is $\mathcal{M}$-compact then $J$ is bounded below on $\mathcal{M}$. For a sequence of points $\alpha$ for which $J$ became negatively infinite could converge to no point $\beta$ of $\mathcal{M}$. For $J(\beta)$ would necessarily be less than every constant $c$.

If the constant $a$ of Theorem 9.1 is less than every value of $J$ the homology class $K$ of Theorem 9.1 becomes a class $K$ of absolute cycles corresponding to which the theorem affirms the existence of a homotopic critical point at which $J = c$. If $K$ is the homology class of 0-cycles this specialization gives a critical point affording an absolute minimum to $J$ on $\Sigma$. Theorem 9.1 may be employed in a variety of ways to obtain and classify critical points. Homotopic critical points lead to "extremals" as will be shown in §10.


10. The existence of extremals


Let $\beta$ be a curve of $\mathcal{M}$. We shall term $\beta$ an extremal if each subarc $\beta^*$ of $\beta$ lying in a coordinate system $S$ is an extremal of $J$ in the coordinate system $S$.

We wish to use the definition of a homotopic critical point, replacing the $\mathcal{L}$-neighborhood in this definition by a $J$-neighborhood. This is permissible if each $\mathcal{L}$-neighborhood of $\beta$ contains a $J$-neighborhood of $\beta$. To give conditions for this we must extend certain definitions to $\Sigma$.

If $(x)$ is a point on $\beta$ in a coordinate system $S$, a pair $(x, r)$ in which $(r)$ is a non-null contravariant vector at $(x)$ will be termed an element tangent to $\beta$ at $(x)$ if $(x, r)$ is an element tangent to $\beta$ at $(x)$ in the ordinary sense in $S$. The phrase almost all elements tangent to $\beta$ is used for a set of elements tangent to $\beta$ at almost all points $s$ of $\beta$, where $s$ is the arc length on $\beta$.

The condition that $f$ be strongly convex at an element $(x, p)$, where $p$ is a non-null contravariant vector, at a point $(x)$ in a coordinate system $S$, is that the invariant $E$-function $E(x, p, r)$ be positive for the given $(x, p)$, except when $r = kp$ with $k \geq 0$.


8 Sufficient conditions that $J^c$ be $\mathcal{M}$-compact are given in Theorem 8.1.

With these definitions we can reaffirm Theorem 6.0 for $\Sigma$ with the new interpretation of its terms. Thus, if $f$ is strongly convex at almost all elements tangent to $\beta$ and is positive semi-normal at all points of $\beta$, then $\alpha$ converges in $\mathcal{L}$-length to $\beta$ if $\alpha$ converges in $J$-length to $\beta$.

We are ready for the homotopy theorem.

THEOREM 10.1. If convergence in $J$-length to a curve $\beta$ implies convergence in $\mathcal{L}$-length to $\beta$ then $\beta$ is an extremal whenever homotopically critical.

We begin with a proof of statement (a).

(a). When convergence in $J$-length to $\beta$ implies convergence in $\mathcal{L}$-length to $\beta$, then convergence in $J$-length to a subarc $\beta_1$ of $\beta$ implies convergence in $\mathcal{L}$-length and $L$-length to $\beta_1$.

Suppose $\beta$ is a sequence of the three arcs $\beta_0$, $\beta_1$, $\beta_2$, admitting the possibility that $\beta_0$ or $\beta_2$ reduce to points. Let $p_1$ and $q_1$ be the end points of $\beta_1$, and suppose $p_1$ and $q_1$ lie in convex coordinate systems $S$ and $T$. Let $\alpha_1$ be an arc so near $\beta_1$ that its end points $p$ and $q$ are on $S$ and $T$ respectively. Let $p_1p$ and $qq_1$ be line segments on $S$ and $T$ respectively joining $p_1$ to $p$ and $q$ to $q_1$. Let $\alpha_1$ be extended by forming the curve

$\alpha = \beta_0\; p_1p\; \alpha_1\; qq_1\; \beta_2.$

When $\alpha_1$ converges in $J$-length to $\beta_1$ it is clear that $\alpha$ converges in $J$-length to $\beta$, so that $\alpha$ then converges in $\mathcal{L}$-length to $\beta$, and finally $\alpha_1$ converges in $\mathcal{L}$-length to $\beta_1$.

Since convergence in $\mathcal{L}$-length and $L$-length are equivalent the proof of (a) is complete.

Theorem 10.1 will be established in its equivalent form: if the first variation $I(\eta)$, formed for a subarc $\beta^*$ of $\beta$ in some coordinate system $S$, is negative for some admissible $\eta$, and if the hypothesis of (a) holds then $\beta$ is homotopically ordinary. By virtue of (a), convergence in $J$-length to $\beta^*$ implies convergence in $L$-length to $\beta^*$. In accordance with the results of §6 there accordingly exists a proper $J$-deformation $\Delta(\eta, \beta^*)$ of an $L$-neighborhood $N^*$ of $\beta^*$.

Let $\alpha$ be a curve on so small an $\mathcal{L}$-neighborhood $N$ of $\beta$ that the subarc $\alpha^*$ of $\alpha$ "corresponding" to $\beta^*$ is on $N^*$. The point deformation used to define $\Delta(\eta, \beta^*)$ will not move the end points of $\alpha^*$ and so can be regarded as defining a deformation $D$ of $\alpha$ in which $\alpha^*$ alone varies. This deformation $D$ is a proper $J$-deformation of the $\mathcal{L}$-neighborhood $N$ of $\beta$. Under the hypotheses of the theorem every $\mathcal{L}$-neighborhood of $\beta$ contains a $J$-neighborhood of $\beta$. That $\beta$ is homotopically ordinary now follows from the definition involved.




11. Résumé 


We shall not attempt in any sense to summarize our results under the weakest 
hypotheses under which they are proved. A broad résumé will however be 
useful. 

In order that $J^c$ be compact for each constant $c$ and $J$ be upper-reducible it is sufficient that $f$ be of class $C'$ for $(r) \neq (0)$, homogeneous in $(r)$ in the usual sense, convex in $(r)$, and positive semi-normal in each coordinate system, and that $J$ satisfy the condition of Hahn. Then the existence of homology classes of non-bounding $k$-cycles or relative $k$-cycles implies the existence of homotopic critical points. (See Theorem 9.1.)

If $\beta$ is a homotopic critical point, the conditions sufficient that $\beta$ be an "extremal" are somewhat stronger but the additional condition need be satisfied only for points $(x)$ near $\beta$. The additional sufficient conditions are that $f$ be strongly convex at almost all elements tangent to $\beta$ and be of class $C''$ for $(x)$ neighboring $\beta$ and for $(r) \neq (0)$.

If f is positive for every (r) # (0) the condition of Hahn is fulfilled. If f is 
positive and positive regular in the classical sense, then all of the preceding 
conditions are satisfied. Thus the classical theory in the large becomes a special 
case of the preceding, at least in its analytical as distinguished from its topo- 
logical foundations. 


INSTITUTE FOR ADVANCED STUDY,
UNIVERSITY OF MISSOURI.


BIBLIOGRAPHY 


McShane, E. J.
1. Semicontinuity of integrals in the calculus of variations, Duke Mathematical Journal, 2 (1936), 597-616.
2. Curve-space topologies associated with variational problems, Annali della R. Scuola Normale Superiore di Pisa, Serie II, 9 (1940), 45-60.
Morse, M.
1. The calculus of variations in the large, American Mathematical Society Colloquium Publications XVIII (1934).
2. A special parameterization of curves, Bulletin of the American Mathematical Society 42 (1936), 915-22.
3. Functional topology and abstract variational theory, Mémorial des Sciences Mathématiques, XCII, Gauthier-Villars, Paris (1939).
4. Functional topology, Bulletin of the American Mathematical Society, 49 (1943), pp. 144-149.
Morse, M. and Ewing, G.
1. The variational theory in the large including the non-regular case—first paper, Annals of Mathematics (1943).
Morse, M. and Tompkins, C.
1. The existence of minimal surfaces of general critical types, Annals of Mathematics 40 (1939), 443-72.



ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


ANALYTIC SOLUTIONS OF NON-LINEAR DIFFERENCE EQUATIONS 


By Walter Strodt
(Received February 17, 1943) 


Introduction 


This paper treats a broad class of ordinary difference equations, both linear 
and non-linear, with arbitrarily many not necessarily commensurate spans. 
The coefficients in the equation are rational functions, or certain more general 
functions of x. Solutions are obtained which are analytic in a half-plane and 
which satisfy a condition of restricted growth at infinity. For each equation 
the number of such solutions obtained is precisely the degree of the equation 
in the unknown, and it is proved that there are no solutions, other than these, 
analytic in such a half-plane and satisfying such a condition of restricted 
growth. The solutions are found by a new technique constructed from a concept
of approximating q-difference equations, from a procedure analogous to the
calculus of limits as applied to algebraic functions, from the classic compactness 
theorem for bounded families of analytic functions, and from a theory of special 
functions (“almost constant” functions) which are generalizations of rational 
functions bounded at infinity. 

To sketch the method more precisely, let us consider its application to the 
very special case of a difference equation of the form 


(α) $y(x + \omega_1)\,y(x + \omega_2) + y(x + \omega_3)\,y(x + \omega_4) = r(x),$


where $\omega_1, \omega_2, \omega_3, \omega_4$ are non-negative numbers, and r(x) is a rational function
whose limit at infinity is finite and different from zero.
We shall, without essential loss of generality in this case, assume that $\omega_1 = 0$.
Letting b be any positive number greater than every $\omega_s$, and defining $q_s = 1 - \omega_s b^{-1}$, (s = 1, 2, 3, 4), we consider the "approximating q-difference equation"


(β) $y(q_1 x + \omega_1)\,y(q_2 x + \omega_2) + y(q_3 x + \omega_3)\,y(q_4 x + \omega_4) = r(x).$


When b tends to infinity, this equation tends formally to the given difference
equation, since every $q_s$ tends to unity.

Now if for every sufficiently large value of b there is an analytic solution
y(x, b) of (β), and if these solutions have a region of analyticity D in common,
and are bounded in D by a bound M independent of b, then the compactness
theorem¹ for bounded families of analytic functions shows that there exists a
sequence $\{b_k;\ k = 1, 2, \cdots\}$ of positive numbers tending to infinity for which
the sequence $\{y(x, b_k);\ k = 1, 2, \cdots\}$ tends to a limit function, uniformly in


1 Montel, "Leçons sur les familles normales," Paris 1927, section 10.


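every closed bounded subset of D. If D is sufficiently extensive, such a limit
function evidently is an analytic solution of (α).

We consider the question of finding such y(x, b), D, and M.

The point x = b plays an important role in this question, since if $y(x) = \sum_{j=0}^{\infty} c_j (x - b)^j$, then $y(q_s x + \omega_s) = \sum_{j=0}^{\infty} c_j q_s^j (x - b)^j$, and (β) becomes


(γ) $\Bigl[\sum_{j=0}^{\infty} c_j q_1^j (x-b)^j\Bigr]\Bigl[\sum_{j=0}^{\infty} c_j q_2^j (x-b)^j\Bigr] + \Bigl[\sum_{j=0}^{\infty} c_j q_3^j (x-b)^j\Bigr]\Bigl[\sum_{j=0}^{\infty} c_j q_4^j (x-b)^j\Bigr] = r(x).$

It is the simplicity of this equation which makes the introduction of equation
(β) advantageous.

Letting $r(x) = \sum_{j=0}^{\infty} r_j (x - b)^j$, we see that equation (γ) determines the $c_j$
recursively as follows:

$2c_0^2 = r_0,$

$c_0 c_j\,(q_1^j + q_2^j + q_3^j + q_4^j) = r_j - \sum_{i=1}^{j-1} c_i c_{j-i}\,(q_1^i q_2^{j-i} + q_3^i q_4^{j-i}), \quad (j = 1, 2, \cdots).$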
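
The recursion just described can be checked numerically. The following is a minimal Python sketch, not the author's procedure. It assumes the illustrative right-hand side r(x) = 1 + 1/x and caller-supplied spans; the function names `taylor_r`, `solve_coefficients` and `residual` are hypothetical. It computes the first few c_j and measures how well they satisfy the coefficient identities obtained by multiplying out the two products of power series in x − b.

```python
# Sketch under stated assumptions: r(x) = 1 + 1/x is an illustrative choice,
# and all function names are hypothetical (they do not come from the paper).

def taylor_r(b, n_terms):
    """Taylor coefficients r_j of the assumed r(x) = 1 + 1/x about x = b."""
    return [1.0 + 1.0 / b] + [(-1.0) ** j / b ** (j + 1) for j in range(1, n_terms)]


def solve_coefficients(omegas, b, n_terms, sign=1.0):
    """Return c_0..c_{n_terms-1}; `sign` selects one of the two roots of 2*c_0^2 = r_0."""
    q = [1.0 - w / b for w in omegas]            # q_s = 1 - omega_s / b
    r = taylor_r(b, n_terms)
    c = [sign * (r[0] / 2.0) ** 0.5]             # from 2 c_0^2 = r_0
    for j in range(1, n_terms):
        # Cross terms c_i c_{j-i} with 1 <= i <= j-1 from both products.
        cross = sum(
            c[i] * c[j - i] * (q[0] ** i * q[1] ** (j - i) + q[2] ** i * q[3] ** (j - i))
            for i in range(1, j)
        )
        denom = c[0] * sum(qs ** j for qs in q)  # c_0 (q_1^j + q_2^j + q_3^j + q_4^j)
        c.append((r[j] - cross) / denom)
    return c


def residual(omegas, b, c):
    """Largest mismatch between the coefficients of the two sides, order by order."""
    q = [1.0 - w / b for w in omegas]
    r = taylor_r(b, len(c))
    worst = 0.0
    for j in range(len(c)):
        lhs = sum(
            c[i] * c[j - i] * (q[0] ** i * q[1] ** (j - i) + q[2] ** i * q[3] ** (j - i))
            for i in range(j + 1)
        )
        worst = max(worst, abs(lhs - r[j]))
    return worst
```

Calling `solve_coefficients` with `sign=-1.0` gives the second formal solution, corresponding to the other choice of the leading coefficient.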


Since $r_0$ approaches a finite non-zero limit as b becomes infinite, there exist
positive numbers $M_1$, $M_2$ independent of b such that $M_1 < |c_0| < M_2$ for all
large values of b. Moreover, $0 < q_s \le 1$, (s = 1, 2, 3, 4), and $q_1 = 1$. Hence,
if numbers $C_j$, (j = 1, 2, ···), are defined recursively by the equations


$M_1 C_j = |r_j| + 2\sum_{i=1}^{j-1} C_i C_{j-i},$


the inequalities $|c_j| \le C_j$ will be valid, (j = 1, 2, ···). Now if $C(x, b) = \sum_{j=1}^{\infty} C_j (x - b)^j$, and $R(x, b) = \sum_{j=1}^{\infty} |r_j| (x - b)^j$, C(x, b) will satisfy the algebraic equation

$M_1 C(x, b) = R(x, b) + 2C^2(x, b).$


Thus C(x, b) is analytic wherever R(x, b) is analytic, except possibly at values
of x for which $M_1^2 = 8R(x, b)$. If $D_0$ is a number exceeding the modulus of
every pole of r(x), R(x, b) is analytic when $|x - b| < b - D_0$. Moreover, for
every positive $\epsilon$ there is a number $D(\epsilon)$ independent of b such that R(x, b) is
analytic and satisfies the inequality $|R(x, b)| < \epsilon$ throughout the region
$|x - b| < b - D(\epsilon)$. (This property of r(x) is the characteristic property
of the class of functions designated in this paper as "almost constant" functions².
It is possessed by all rational functions which are bounded at infinity³, and by
many other functions, including for example the solutions of (α) which we shall
obtain.)
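
The dominant-function step can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the majorant coefficients |r_j| are taken to be 0.01·(1/2)^j, the constant M_1 is taken to be 1, and the function names are hypothetical. It builds the C_j from the quadratic recursion M_1 C_j = |r_j| + 2 Σ C_i C_{j−i} and compares the resulting series with the root of M_1 C = R + 2C² that vanishes when R = 0.

```python
import math

# Sketch under stated assumptions: the |r_j| used below and M_1 = 1 are
# illustrative values, and the function names are hypothetical.

def dominant_coefficients(abs_r, m1):
    """C_j from M_1*C_j = |r_j| + 2*sum_{i=1}^{j-1} C_i*C_{j-i}, with abs_r[j] = |r_j|."""
    big_c = [0.0]  # index 0 is a placeholder so that big_c[j] holds C_j
    for j in range(1, len(abs_r)):
        conv = sum(big_c[i] * big_c[j - i] for i in range(1, j))
        big_c.append((abs_r[j] + 2.0 * conv) / m1)
    return big_c


def closed_form(r_value, m1):
    """The root of M_1*C = R + 2*C^2 that vanishes when R = 0."""
    return (m1 - math.sqrt(m1 * m1 - 8.0 * r_value)) / 4.0


def series_value(coeffs, u):
    """Evaluate sum_{j>=1} coeffs[j] * u**j."""
    return sum(cj * u ** j for j, cj in enumerate(coeffs) if j > 0)
```

The square root in `closed_form` makes the exceptional values visible: the root stops being analytic exactly where M_1² = 8R.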


2 Definition 2, below. 
3 By Lemma VII of the appendix. 




If now $\epsilon$ is any positive number smaller than $M_1^2/8$, the equation $M_1^2 = 8R(x, b)$ is false at every point of the region $|x - b| < b - D(\epsilon)$, and consequently C(x, b) is analytic in the region $|x - b| < b - D(\epsilon)$. Since
$|c_j| \le C_j$, (j = 1, 2, ···), it follows that $\sum_{j=1}^{\infty} c_j (x - b)^j$ is analytic in the
region $|x - b| < b - D(\epsilon)$. By choosing $\epsilon$ sufficiently small we can make
R(x, b) arbitrarily small in $|x - b| < b - D(\epsilon)$, and hence make C(x, b)
arbitrarily near a root X of the equation $M_1 X = 2X^2$. Thus, if $\epsilon$ is fixed as a
sufficiently small number, there will be a number $M_0$ independent of b such that
$|C(x, b)| < M_0$ throughout $|x - b| < b - D(\epsilon)$. This implies that
$|\sum_{j=1}^{\infty} c_j (x - b)^j| < M_0$ throughout $|x - b| < b - D(\epsilon)$. Thus we shall
obtain the desired y(x, b), D, and M by taking for y(x, b) the series
$\sum_{j=0}^{\infty} c_j (x - b)^j$, taking for D any bounded region which with its boundary is
included in the half-plane $\Re(x) > D(\epsilon)$, and taking for M the number $M_0 + M_2$.

Hence at least one solution y(x) of equation (α), analytic in D, is obtained as
the limit of a sequence $\{y(x, b);\ b = b_1, b_2, \cdots\}$ of solutions of equations (β).
This solution of (α) can be continued analytically throughout the half-plane
$\Re(x) > D(\epsilon)$ by the use of an increasing sequence of regions D. It will be
bounded in this half-plane by the bound M. Since there are two distinct
choices for $c_0$, it is easy to prove that at least two distinct solutions of (α),
analytic and bounded in a right half-plane, can be obtained in this way.

It can be shown that every solution of (α) which is analytic and bounded in a
right half-plane can be expressed as the limit, as b becomes infinite through all
sufficiently large positive values, of y(x, b), where y(x, b) is a solution of (β),
analytic at x = b, and where the limit is uniform in every closed bounded subset
of a certain right half-plane; from this it follows easily that there are at most
two distinct solutions of (α), analytic and bounded in a right half-plane. For
the proof that each bounded solution of (α) can be so expressed as a limit function, we note that if $y_0(x)$ is a solution of (α) then it is also a solution of the
q-difference equation


(δ) $y(q_1 x + \omega_1)\,y(q_2 x + \omega_2) + y(q_3 x + \omega_3)\,y(q_4 x + \omega_4) = \rho(x),$

where
$\rho(x) = r(x) + [\,y_0(q_1 x + \omega_1)\,y_0(q_2 x + \omega_2) - y_0(x + \omega_1)\,y_0(x + \omega_2) + y_0(q_3 x + \omega_3)\,y_0(q_4 x + \omega_4) - y_0(x + \omega_3)\,y_0(x + \omega_4)\,].$


A study of the difference $f(q_s x + \omega_s) - f(x + \omega_s)$, (s = 1, 2, 3, 4), for functions
f(x) bounded in a right half-plane shows that if $y_0(x)$ is bounded in a right half-plane, $\rho(x)$ is in a certain sense so nearly equal to r(x) when b is large that one
of the two solutions of (β), analytic at x = b, is near the solution $y_0(x)$ of (δ).
The detailed proof of this part, even in the simple case under discussion, leans




heavily upon the lemmas of the appendix to this paper, where a systematic 
study of almost constant functions is made. 

The proof of the general theorems demonstrated below follows closely the 
outline of the proof in this special case, except that a modification of the tech- 
nique of dominant functions is employed; this modification consists in a trans- 
formation of the approximating q-difference equation by a substitution of the 
form y(x) = 2°z(x) before the use of dominant functions in the manner described 
above; the purpose of the modification is the securing of a sharper estimate for 
the radius of convergence of the solutions at x = b of the approximating q-difference equation.

We now state the general theorems to be proved in this paper. 


Statement of Theorems 


THEOREM 1. Given the difference equation of degree n,


(1) $\sum_{k=1}^{m} a_k(x) \prod_{s=1}^{s(k)} y(x + \omega_{ks}) = \varphi(x),$


where
(2) $\varphi(x)$ is a rational function, and therefore is asymptotically equivalent to $c x^p$ for
some non-zero complex number c and some integer p, (positive, negative, or
zero),

(3) $a_k(x)$ is a rational function having a finite (perhaps zero) limit $a_k$ at infinity,
(k = 1, 2, ···, m),

(4) 0S om Sow Sony, 

(5) $s(k) \le n$,

(6) $\Re(a_k) \ge 0$ whenever s(k) = n,

(7) s(k) = n for all k if $p \le 0$,

(8) for at least one value of k, the relations s(k) = n, $\Re(a_k) > 0$, $\omega_{k1} = 0$ are all
valid.
Conclusion: For all sufficiently large positive constants D, M there are in the region
$\Re(x) > D$ exactly n analytic solutions y(x) of (1) which satisfy the inequality

(9) $|y(x)| < M|x|^{p/n}.$

Every solution y(x) of (1) which is analytic in a half-plane $\Re(x) > D > 0$
and there satisfies (9) can be expressed in the following form:


(10) $y(x) = \lim_{b \to \infty} y_b(x),$


where $y_b(x)$ is a solution of the "approximating q-difference equation" (see Definition 1), analytic at x = b, and where the limit is uniform in every closed bounded
subset of some half-plane $\Re(x) > D' \ge D$.

DEFINITION 1. THE APPROXIMATING q-DIFFERENCE EQUATION. Let b be a
positive⁴ number greater than every one of the numbers $\omega_{ks}$ appearing in equation
(1). Let $q_{ks}$ be defined by the equations


4 Throughout this paper it is always understood that b is positive. 




(11) $q_{ks} = 1 - \omega_{ks} b^{-1},$ (k = 1, ···, m; s = 1, 2, ···, s(k)).
Then the functional equation

(12) $\sum_{k=1}^{m} a_k(x) \prod_{s=1}^{s(k)} y(q_{ks} x + \omega_{ks}) = \varphi(x)$


will be called the approximating q-difference equation for equation (1).

(This terminology is motivated by the fact demonstrated in Lemma 1 that
(12) becomes a q-difference equation if the independent variable is taken as
x − b instead of x, and by the fact that when b is large $q_{ks}$ is near 1, so that
equation (12) is in a formal sense an approximation to equation (1). It may be
seen from this that the "singular point" of (12), from the standpoint of the
theory of q-difference equations, is the point x = b, which is precisely the point
at which $y_b(x)$ is required to be analytic.)

The proof of Theorem 1 will be omitted, since in what follows a more general
theorem will be demonstrated. The condition that $a_k(x)$, (k = 1, ···, m),
and $\varphi(x)x^{-p}$ be rational and bounded at infinity will be replaced by the less
restrictive condition that these functions be "almost constant", as explained
in Definition 2. (It is a consequence of Lemma VII of the appendix that every
rational function which is bounded at infinity is almost constant.)

DEFINITION 2. ALMOST CONSTANT FUNCTIONS. Let f be either a function f(x)
which is analytic in a half-plane $\Re(x) > D > 0$, or a function f(x, b) of the two
variables x, b which for all sufficiently large values of b is analytic in the region
$|x - b| < b - D$, for a positive number D independent of b. Then if b is sufficiently large, f can be expanded in a Taylor's series $\sum_{j=0}^{\infty} \beta_j (x - b)^j$, convergent
for $|x - b| < b - D$. We shall say that f is almost constant if for every positive $\epsilon$
there is a positive number $D(\epsilon)$, independent of b, such that $\bigl|\sum_{j=0}^{\infty} |\beta_j| (x - b)^j - |\beta_0|\bigr| < \epsilon$ whenever $|x - b| < b - D(\epsilon)$. (Note: In what follows, we shall
sometimes deal with functions of the two variables x, b, of which only the first
is displayed in the notation.)
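
As a concrete illustration of Definition 2, consider the assumed example f(x) = 1 + 1/x, which is rational and bounded at infinity. Its Taylor coefficients about x = b have moduli |β_j| = b^{-(j+1)} for j ≥ 1, so the quantity controlled in the definition is at most Σ_{j≥1} |β_j| ρ^j = ρ / (b(b − ρ)) for |x − b| ≤ ρ, and with ρ = b − D this is below 1/D whatever b is. The Python sketch below (hypothetical function names, illustrative values only) checks this numerically.

```python
# Sketch under stated assumptions: f(x) = 1 + 1/x is an illustrative example
# and the function names are hypothetical.

def majorant_excess(b, radius, n_terms=5000):
    """Partial sum of sum_{j>=1} |beta_j| * radius**j for f(x) = 1 + 1/x about x = b."""
    ratio = radius / b                      # each term equals (radius/b)**j / b
    return sum(ratio ** j for j in range(1, n_terms)) / b


def closed_form_excess(b, radius):
    """Closed form radius / (b * (b - radius)) of the same series, for radius < b."""
    return radius / (b * (b - radius))
```

For this example one may therefore take D(ε) = 1/ε, independently of b, which is the uniformity the definition asks for.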

We now state the more general theorem. 

THEOREM 2. Given the difference equation (1), where
(2') $\varphi(x)x^{-p}$ is almost constant, and the limit as b becomes infinite of $\varphi(b)b^{-p}$ is c,
for some real number p and some non-zero complex number c,
where
(3') $a_k(x)$ is almost constant, and the limit as b becomes infinite of $a_k(b)$ is a
finite (perhaps zero) number $a_k$, (k = 1, 2, ···, m),
and where (4), (5), (6), (7), (8) are valid.
Conclusion: All the conclusions of Theorem 1 are valid, with this new interpretation
of equation (1). Moreover, if $y_0(x)$ is any solution of (1) which is analytic in
a half-plane $\Re(x) > D > 0$ and there satisfies (9), then $y_0(x)x^{-p/n}$ is almost
constant.




Proof of Theorem 2 


Part I. PRELIMINARY MODIFICATIONS OF THE APPROXIMATING q-DIFFERENCE
EQUATION


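Lemma 1. Let f(x) be any function of x, analytic at x = b, and suppose that

(13) $f(x) = \sum_{j=0}^{\infty} f_j (x - b)^j.$

Then $f(q_{ks} x + \omega_{ks})$ is analytic at x = b, and

(14) $f(q_{ks} x + \omega_{ks}) = \sum_{j=0}^{\infty} f_j q_{ks}^j (x - b)^j,$ (k = 1, ···, m; s = 1, ···, s(k)).

Proof.

$f(q_{ks} x + \omega_{ks}) = \sum_{j=0}^{\infty} f_j (q_{ks} x + \omega_{ks} - b)^j = \sum_{j=0}^{\infty} f_j (q_{ks} x + b - b q_{ks} - b)^j = \sum_{j=0}^{\infty} f_j q_{ks}^j (x - b)^j.$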
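
The identity behind Lemma 1 is that, with q = 1 − ω/b, one has qx + ω − b = q(x − b), so that replacing x by qx + ω multiplies the j-th Taylor coefficient about b by q^j. A minimal Python check follows; the helper names and the sample polynomial are hypothetical and serve only as an illustration.

```python
# Sketch: helper names and sample values are hypothetical illustrations.

def shifted_coeffs(f_coeffs, q):
    """Coefficients f_j * q**j of f(q*x + omega) in powers of (x - b)."""
    return [fj * q ** j for j, fj in enumerate(f_coeffs)]


def eval_series(coeffs, x, b):
    """Evaluate sum_j coeffs[j] * (x - b)**j for a finite coefficient list."""
    return sum(cj * (x - b) ** j for j, cj in enumerate(coeffs))


def q_for_span(omega, b):
    """The multiplier q = 1 - omega / b attached to the span omega."""
    return 1.0 - omega / b
```

Evaluating the original series at qx + ω and the rescaled series at x gives the same number, which is what turns the approximating equation into a q-difference equation in the variable x − b.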


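Lemma 2. The substitution⁵ $y(x) = x^{p/n} z(x)$ transforms equation (12) into

(15) $\sum_{k=1}^{m} \alpha_k(x) \prod_{s=1}^{s(k)} z(q_{ks} x + \omega_{ks}) = r(x),$
where
(16) $\alpha_k(x) = a_k(x)\, x^{-p} \prod_{s=1}^{s(k)} (q_{ks} x + \omega_{ks})^{p/n},$ (k = 1, ···, m),
and
(17) $r(x) = \varphi(x) x^{-p}.$


Lemma 3. Let the definitions $\alpha_k(x) = \sum_{j=0}^{\infty} \alpha_{kj} u^j$, $r(x) = \sum_{j=0}^{\infty} r_j u^j$, $z(x) = \zeta + \sum_{j=1}^{\infty} z_j u^j$ be made, where
(18) u = x − b.


(The functions $\alpha_k(x)$, r(x) are easily seen to be analytic at x = b, and hence so
expressible, provided b is sufficiently large; $\zeta$ and the $z_j$ on the other hand are
to be determined.) Then equation (15) becomes the q-difference equation


(19) $\sum_{k=1}^{m} \Bigl(\sum_{j=0}^{\infty} \alpha_{kj} u^j\Bigr) \prod_{s=1}^{s(k)} \Bigl(\zeta + \sum_{j=1}^{\infty} z_j q_{ks}^j u^j\Bigr) - \sum_{j=0}^{\infty} r_j u^j = 0.$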


Proof by Lemma 1. 


5 Here and throughout the paper that branch of $x^{p/n}$ in $\Re(x) > 0$ is chosen which is
positive when x is positive.




Part II. FORMAL SOLUTIONS OF THE q-DIFFERENCE EQUATION (19)


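Lemma 4. If the coefficient of $u^j$, (j = 0, 1, 2, ···) in the left-hand member of
(19) is equated to zero, the equations obtained are


(20) $\sum_{k=1}^{m} \alpha_{k0}\, \zeta^{s(k)} = r_0,$
and


(21) $z_j \sum_{k=1}^{m} \alpha_{k0}\, \zeta^{s(k)-1} \sum_{s=1}^{s(k)} q_{ks}^j = r_j - P_j(\zeta, z_1, \cdots, z_{j-1}, q_{ks}, \alpha_{kh}),$


where $P_j$, (j = 1, 2, ···) is a polynomial, with positive integers for coefficients,
in the indicated arguments, (with k = 1, 2, ···, m; s = 1, 2, ···, s(k); h =
0, 1, ···, j).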


Lemma 5. $\lim r_0 = c$.
$\lim \alpha_{k0} = 0$, if $s(k) \ne n$.
$\lim \alpha_{k0} = a_k$, if s(k) = n. (All limits as $b \to \infty$.)


$\Re\Bigl(\sum_{s(k)=n} a_k \sum_{s=1}^{n} q_{ks}^j\Bigr)$

is bounded below by a positive number independent of j and b, (j = 0, 1, 2, ···).
Proof. The first assertion follows from (2'), and the fact that $r(x) = \varphi(x)x^{-p}$.
Because of (7), the second assertion is vacuously true when $p \le 0$; when p > 0,
it follows at once from (16). The third assertion follows from (16), since the
factor multiplying $a_k(x)$ in (16) tends to 1 when $b \to \infty$, and since (3') states that $a_k(b) \to a_k$ when $b \to \infty$.

The fourth assertion follows from the fact that because of (8) there is a number
l such that s(l) = n, $\Re(a_l) > 0$, and $\omega_{l1} = 0$. From this $q_{l1} = 1$, and therefore,
in view of (6), we have

$\Re\Bigl(\sum_{s(k)=n} a_k \sum_{s=1}^{n} q_{ks}^j\Bigr) \ge \Re(a_l),$
and this lower bound is independent of j and b.
Lemma 6. When b is large, equation (20) has n solutions $\zeta = \zeta^{(t)}$, (t = 1, 2, ···, n)
one near each of the n distinct n-th roots $\sigma_1, \sigma_2, \cdots, \sigma_n$ of the number c/I, where
$I = \sum a_k$, the summation being over all the values of k for which s(k) = n. There
exist positive numbers⁶ $M_1$, $M_2$ such that


(22) $M_1 < |\zeta^{(t)}| < M_2,$ (t = 1, 2, ···, n),


for all sufficiently large values of b.
Proof. By the last part of Lemma 5, $I \ne 0$.


6 Here, and henceforth, the symbols M and D, possibly with subscripts, shall stand for 
positive numbers independent of b. 




By Lemma 5, as $b \to \infty$, the initial coefficient in (20) tends to the non-zero
number I; the constant term tends to the non-zero number c, and all other
coefficients tend to zero.

Since the roots of a polynomial are continuous functions of the coefficients, 
Lemma 6 follows. 
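
The continuity argument of Lemma 6 can be illustrated numerically. In the Python sketch below all numerical values are assumptions chosen for illustration (n = 3, I = 2, c = 1 + i, and small perturbations standing in for the coefficients that tend to zero), and the function names are hypothetical. It computes the n distinct n-th roots of c/I and runs a Newton iteration on a slightly perturbed polynomial from each of them, as a stand-in for "the roots of a polynomial are continuous functions of the coefficients".

```python
import cmath

# Sketch under stated assumptions: the sample polynomial and all numerical
# values used with these helpers are illustrative, not taken from the paper.

def nth_roots(value, n):
    """The n distinct n-th roots of a non-zero complex number."""
    radius = abs(value) ** (1.0 / n)
    angle = cmath.phase(value)
    return [cmath.rect(radius, (angle + 2.0 * cmath.pi * t) / n) for t in range(n)]


def poly_value(coeffs, z):
    """Evaluate sum_d coeffs[d] * z**d."""
    return sum(a * z ** d for d, a in enumerate(coeffs))


def newton_root(coeffs, start, steps=50):
    """Newton iteration for the polynomial with the given coefficients, started at `start`."""
    z = start
    for _ in range(steps):
        slope = sum(d * a * z ** (d - 1) for d, a in enumerate(coeffs) if d > 0)
        z = z - poly_value(coeffs, z) / slope
    return z
```

When the perturbations are small, each Newton limit stays close to the root of c/I it started from, so the n perturbed roots remain distinct and bounded away from zero and infinity, as in (22).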

Lemma 7. For all sufficiently large values of b, the coefficient of $z_j$ in (21) is in
modulus bounded below by a positive number $M_3$ independent of j.

Proof. The coefficient in question is $\sum_{k=1}^{m} \alpha_{k0}\, \zeta^{s(k)-1} \sum_{s=1}^{s(k)} q_{ks}^j$. When b is
large the terms in this summation for which $s(k) \ne n$ are small, because of (22)
and Lemma 5. The remaining terms are near $\sigma_t^{n-1} \sum_{s(k)=n} a_k \sum_{s=1}^{n} q_{ks}^j$, (for
some t), the modulus of which is at least $M_1^{n-1}\, \Re\bigl(\sum_{s(k)=n} a_k \sum_{s=1}^{n} q_{ks}^j\bigr)$, and this by
Lemma 5 is bounded below by a positive number independent of j.

Lemma 8. If b is large equation (19) is satisfied by exactly n distinct formal power
series $\zeta^{(t)} + \sum_{j=1}^{\infty} z_j^{(t)} u^j$, (t = 1, 2, ···, n).

Proof. By Lemmas 6 and 7, (since the coefficient of $z_j$ in equation (21), being
bounded from below, cannot be zero).


Part III. ANALYTICITY OF DOMINANT FUNCTIONS


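Lemma 9. Let $Z_j$, (j = 1, 2, ···) be defined by the recursive relations
(23) $Z_j M_3 = |r_j| + P_j(M_2, Z_1, \cdots, Z_{j-1}, 1, |\alpha_{kh}|),$ (k = 1, 2, ···, m; h = 0, 1, ···, j).


Then $|z_j^{(t)}| \le Z_j$, (t = 1, 2, ···, n; j = 1, 2, ···).

Proof. This follows from (21), (22), Lemma 7, and the fact that the coefficients in the polynomial $P_j$ are positive. (A formal proof can be given by
induction on j.)

Lemma 10. Let Z(u) be the formal power series $\sum_{j=1}^{\infty} Z_j u^j$. Then Z(u) satisfies
the following algebraic equation


(24) $M_3 Z(u) = R(u) - R(0) + \sum_{k=1}^{m} [A_k(u) - A_k(0)]\,(M_2 + Z(u))^{s(k)} + \sum_{k=1}^{m} A_k(0)\,\bigl[(M_2 + Z(u))^{s(k)} - s(k) M_2^{s(k)-1} Z(u) - M_2^{s(k)}\bigr],$

where

(25) $R(u) = \sum_{j=0}^{\infty} |r_j|\, u^j$

and

(26) $A_k(u) = \sum_{j=0}^{\infty} |\alpha_{kj}|\, u^j,$ (k = 1, 2, ···, m).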




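Proof. From (19), (20), and (21) it is evident that

$\sum_{j=1}^{\infty} P_j(\zeta, z_1, \cdots, z_{j-1}, q_{ks}, \alpha_{kh})\, u^j = \sum_{k=1}^{m} \Bigl(\sum_{j=0}^{\infty} \alpha_{kj} u^j\Bigr) \prod_{s=1}^{s(k)} \Bigl(\zeta + \sum_{j=1}^{\infty} z_j q_{ks}^j u^j\Bigr) - \sum_{k=1}^{m} \alpha_{k0}\, \zeta^{s(k)} - \sum_{j=1}^{\infty} z_j u^j \sum_{k=1}^{m} \alpha_{k0}\, \zeta^{s(k)-1} \sum_{s=1}^{s(k)} q_{ks}^j.$

Hence

$\sum_{j=1}^{\infty} P_j(M_2, Z_1, \cdots, Z_{j-1}, 1, |\alpha_{kh}|)\, u^j$

$= \sum_{k=1}^{m} \Bigl(\sum_{j=0}^{\infty} |\alpha_{kj}| u^j\Bigr) \Bigl(M_2 + \sum_{j=1}^{\infty} Z_j u^j\Bigr)^{s(k)} - \sum_{k=1}^{m} |\alpha_{k0}|\, M_2^{s(k)} - \sum_{j=1}^{\infty} Z_j u^j \sum_{k=1}^{m} |\alpha_{k0}|\, M_2^{s(k)-1} s(k)$

$= \sum_{k=1}^{m} \bigl[A_k(u)\,(M_2 + Z(u))^{s(k)} - M_2^{s(k)-1} s(k) A_k(0) Z(u) - A_k(0) M_2^{s(k)}\bigr]$

$= \sum_{k=1}^{m} (A_k(u) - A_k(0))\,(M_2 + Z(u))^{s(k)} + \sum_{k=1}^{m} A_k(0)\,\bigl[(M_2 + Z(u))^{s(k)} - s(k) M_2^{s(k)-1} Z(u) - M_2^{s(k)}\bigr].$

But (23) implies that

$M_3 Z(u) = R(u) - R(0) + \sum_{j=1}^{\infty} P_j(M_2, Z_1, Z_2, \cdots, Z_{j-1}, 1, |\alpha_{kh}|)\, u^j.$


Hence Lemma 10 is established.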
Lemma 11. For every positive $\epsilon$ there is a positive number $D(\epsilon)$ such that for all
sufficiently large values of b


(27) $|R(u) - R(0)| < \epsilon$ when $|u| < b - D(\epsilon)$,
and
(28) $|A_k(u) - A_k(0)| < \epsilon$ when $|u| < b - D(\epsilon)$, (k = 1, 2, ···, m).


Proof. r(x) is almost constant, by hypothesis. Since by hypothesis $a_k(x)$
is almost constant, it follows from Lemmas II, III, VI of the appendix that
$\alpha_k(x)$ is almost constant.

Lemma 12. For every positive $\delta$ there is a positive number $D_1(\delta)$ such that throughout the region $|u| < b - D_1(\delta)$, for all sufficiently large values of b, every coefficient of (24), considered as an equation in Z(u), is within $\delta$ of the corresponding
coefficient in the equation


(29) $M_3 X = \sum_{s(k)=n} |a_k|\,\bigl((M_2 + X)^n - n M_2^{n-1} X - M_2^n\bigr),$


considered as an equation in X.
Proof. If $\epsilon$ is small, and $D(\epsilon)$ is chosen as in Lemma 11, the expressions
R(u) − R(0) and $A_k(u) - A_k(0)$ appearing in (24) will be small throughout
the region $|u| < b - D(\epsilon)$. Also, when b is large, $A_k(0)$ is small whenever
$s(k) \ne n$ and near $|a_k|$ whenever s(k) = n, (by Lemma 5). From these remarks Lemma 12 follows at once.

Lemma 13. Equation (29) has a non-zero discriminant.

Proof. Letting Y and r be defined by $Y = X/M_2$, and $r = M_3 / \bigl(M_2^{n-1} \sum_{s(k)=n} |a_k|\bigr)$, we obtain from (29) the equation


(30) $rY = (1 + Y)^n - nY - 1.$


To prove that the discriminant of (29) is not zero it suffices to prove that (30)
and the derived equation


(31) $r = n(1 + Y)^{n-1} - n$


have no common solution. Assuming the contrary, let us suppose that Y is a
solution of both (30) and (31). Then


(32) $(r + n)Y = (1 + Y)^n - 1,$
and

(33) $r + n = n(1 + Y)^{n-1},$

so that Y is the positive number $r/[(r + n)(n - 1)]$, and
(34) $(1 + Y)^{n-1} = 1 + \gamma,$ where $\gamma = r/n$.


Hence $1 + Y = (1 + \gamma)^{\tau}$, assuming $\tau = (n - 1)^{-1}$. Then, from (32),
$(n\gamma + n)\,[(1 + \gamma)^{\tau} - 1] = (1 + \gamma)^{n\tau} - 1,$

or

(35) $n(1 + \gamma)^{\tau+1} - n\gamma - (1 + \gamma)^{n\tau} - n + 1 = 0.$


Now the left-hand member of (35) is zero when $\gamma = 0$, and its derivative with
respect to $\gamma$ is $n(\tau + 1)(1 + \gamma)^{\tau} - n - n\tau(1 + \gamma)^{n\tau - 1}$, hence, (since $n\tau - 1 = \tau$),
is the number $n(1 + \gamma)^{\tau} - n$ which is positive when $\gamma$ is positive. Thus (35)
is false, since $\gamma$ is positive. This contradiction establishes Lemma 13.
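
The monotonicity argument that closes the proof of Lemma 13 can be spot-checked numerically. The Python sketch below assumes that the left-hand member of (35) is n(1 + γ)^{τ+1} − nγ − (1 + γ)^{nτ} − n + 1 with τ = 1/(n − 1), which is the form consistent with the derivative quoted in the proof; the function names are hypothetical. It evaluates this expression and the simplified derivative n(1 + γ)^τ − n on a grid of positive γ.

```python
# Sketch under stated assumptions: the expression for (35) is the form
# consistent with the derivative stated in the proof; names are hypothetical.

def lhs_35(gamma, n):
    """Assumed left-hand member of (35), with tau = 1/(n - 1) and n >= 2."""
    tau = 1.0 / (n - 1)
    return n * (1 + gamma) ** (tau + 1) - n * gamma - (1 + gamma) ** (n * tau) - n + 1


def derivative_35(gamma, n):
    """The simplified derivative n*(1 + gamma)**tau - n quoted in the proof."""
    tau = 1.0 / (n - 1)
    return n * (1 + gamma) ** tau - n
```

For n = 2 the expression reduces to γ², which makes the positivity for γ > 0 immediate in that case.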

Lemma 14. There is a positive number $D_0$ such that when b is large equation (24)
has n distinct solutions Z(u), each analytic in the region $|u| < b - D_0$. Thus the
series $\sum_{j=1}^{\infty} Z_j u^j$ converges in the region $|u| < b - D_0$ to an analytic function of u.
Proof. Since the discriminant of equation (29) is different from zero, and since




the discriminant of a polynomial is a continuous function of the coefficients of
the polynomial, it follows from Lemma 12 that when $D_0$ and b are sufficiently
large, the discriminant of (24) will be different from zero throughout the region
$|u| < b - D_0$. From this Lemma 14 follows at once.


Part IV. ANALYTIC SOLUTIONS OF THE APPROXIMATING q-DIFFERENCE
EQUATION


Lemma 15. For all sufficiently large values of b, there are exactly n distinct solutions $y(x) = y^{(t)}(x, b)$, (t = 1, 2, ···, n), of equation (12), analytic in the neighborhood of x = b. There are positive numbers D and M such that for all sufficiently
large values of b, $y^{(t)}(x, b)$, (t = 1, 2, ···, n) is analytic in the region $|x - b| < b - D$, and there satisfies the inequality $|y^{(t)}(x, b)| < M|x|^{p/n}$.

Every function $y^{(t)}(x, b)x^{-p/n}$, (t = 1, ···, n), is almost constant.

When b is large, the numbers $y^{(t)}(b, b)b^{-p/n}$ are n distinct numbers, one near each
of the numbers $\sigma_1, \sigma_2, \cdots, \sigma_n$, (defined in Lemma 6). (It may be assumed that
the superscripts are so assigned to the solutions that $y^{(t)}(b, b)b^{-p/n}$, (t = 1, ···, n),
tends to $\sigma_t$ as $b \to \infty$.)
Proof. By Lemma 8 there are exactly n distinct formal power series
$\zeta^{(t)} + \sum_{j=1}^{\infty} z_j^{(t)} u^j$, (t = 1, ···, n), which satisfy (19). Since $|z_j^{(t)}| \le Z_j$,
(t = 1, ···, n), it follows from Lemma 14 that each of these power series converges to a function analytic in the region $|u| < b - D_0$. There are no other
solutions of (19), analytic in the neighborhood of u = 0.


Let
(36) $z^{(t)}(x, b) = \zeta^{(t)} + \sum_{j=1}^{\infty} z_j^{(t)} (x - b)^j,$ (t = 1, ···, n),
and let
(37) $y^{(t)}(x, b) = z^{(t)}(x, b)\, x^{p/n},$ (t = 1, ···, n).


Then the functions $z^{(t)}(x, b)$ are n distinct solutions of (15) analytic in
$|x - b| < b - D_0$, and the functions $y^{(t)}(x, b)$ are as a consequence n distinct solutions of (12), analytic in $|x - b| < b - D_0$. There are no other
solutions of (12), analytic at x = b. This establishes the first statement of
Lemma 15.

By Lemma 13, equation (29) has n distinct solutions $X_1, X_2, \cdots, X_n$.
Let $Z_1(u), Z_2(u), \cdots, Z_n(u)$, be the n distinct solutions of equation (24), analytic in $|u| < b - D_0$, (see Lemma 14).

Because the roots of a polynomial are continuous functions of the coefficients,
it follows from Lemma 12 that for every positive $\epsilon$, if $D_2(\epsilon)$ and b are both
sufficiently large, with $D_2(\epsilon) > D_0$, there is for every u in the region $|u| < b - D_2(\epsilon)$ and for every w, (w = 1, 2, ···, n), a number X(u, w) belonging to
the set $X_1, \cdots, X_n$ and satisfying the inequality


(38) $|Z_w(u) - X(u, w)| < \epsilon.$




Since $Z_w(u)$ varies continuously with u, and X(u, w) can assume only the discrete
values $X_1, X_2, \cdots, X_n$, it follows from (38) that if $\epsilon$ is sufficiently small,
X(u, w) is a constant function of u. This implies that the subscripts of
$Z_1(u), Z_2(u), \cdots, Z_n(u)$ can be chosen in such a way that


(39) $|Z_w(u) - X_w| < \epsilon,$ (w = 1, 2, ···, n),


whenever $|u| < b - D_2(\epsilon)$. We assume that this choice of subscripts is made.
There is precisely one solution $Z_w(u)$ of (24) which vanishes when u = 0. Let
us assume that $Z_1(u)$ is this solution. Then from (39) it follows that $|X_1| < \epsilon$,
whence $X_1 = 0$. Hence from (39) again it follows that


(40) $|Z_1(u)| < \epsilon$ whenever $|u| < b - D_2(\epsilon)$.


Now $Z_1(u)$ is the only solution of (24) which vanishes when u = 0, and hence
must be identical with the function $\sum_{j=1}^{\infty} Z_j u^j$, since the latter, also, is a solution
of (24) which vanishes when u = 0. Thus


(41) $\Bigl|\sum_{j=1}^{\infty} Z_j u^j\Bigr| < \epsilon$ when $|u| < b - D_2(\epsilon)$.

Since $|z_j^{(t)}| \le Z_j$, (t = 1, 2, ···, n; j = 1, 2, ···), it follows that

(42) $\Bigl|\sum_{j=1}^{\infty} |z_j^{(t)}|\, u^j\Bigr| < \epsilon,$ (t = 1, 2, ···, n),

when $|u| < b - D_2(\epsilon)$.

This implies that $y^{(t)}(x, b)x^{-p/n}$ is almost constant.

By equation (22), $|\zeta^{(t)}| < M_2$. Hence it follows from (42)
that $|\zeta^{(t)} + \sum_{j=1}^{\infty} z_j^{(t)} u^j| < M$, or $|y^{(t)}(x, b)| < M|x|^{p/n}$, when $|x - b| < b - D$, provided $D > D_2(1)$, and $M = M_2 + 1$.

Since $y^{(t)}(b, b)b^{-p/n} = z^{(t)}(b, b) = \zeta^{(t)}$, (t = 1, 2, ···, n), the final statement
of the lemma follows from Lemma 6.


Part V. EXISTENCE OF ANALYTIC SOLUTIONS OF THE DIFFERENCE EQUATION


Lemma 16. There exist at least n distinct solutions $y^{(t)}(x)$, (t = 1, 2, ···, n),
of the difference equation (1), each analytic in the region $\Re(x) > D$, and there satisfying the inequalities $|y^{(t)}(x)| \le M|x|^{p/n}$, (t = 1, 2, ···, n). When x tends
to $\infty$ along the positive real axis, $y^{(t)}(x)x^{-p/n}$ tends to $\sigma_t$, (t = 1, 2, ···, n).
Proof. Let B be a bounded region which with its boundary is included in the
region $\Re(x) > D$. There is a positive number $b_0$ such that the region $|x - b| < b - D$ will contain B provided $b > b_0$.

Let $\{b_k\}$, (k = 1, 2, ···), be a sequence of positive numbers increasing to infinity, with $b_1 > b_0$. Let t be any one of the integers 1, 2, ···, n. The sequence
of analytic functions $\{y^{(t)}(x, b_k)\}$, (k = 1, 2, ···) is a bounded family in B,
because of Lemma 15. Hence there exists a subsequence $\{c_k\}$ of the sequence
$\{b_k\}$, such that the sequence of functions $\{y^{(t)}(x, c_k)\}$ converges to a limit function analytic in B, the convergence being uniform in every closed subset of B.⁷
By the standard device of expressing the region $\Re(x) > D$ as the union of a
sequence of such regions B, each region including the preceding, the limit function can be continued analytically throughout the region $\Re(x) > D$, so that a
function $y^{(t)}(x)$ is obtained, analytic in the region $\Re(x) > D$, and satisfying the
relation


(43) $y^{(t)}(x) = \lim_{k \to \infty} y^{(t)}(x, d_k),$


where $\{d_k\}$ is some subsequence of $\{b_k\}$, where it is understood that for each
value of x in $\Re(x) > D$ the function $y^{(t)}(x, d_k)$ is defined only for sufficiently
large values of k, and where the limit is uniform in every closed bounded subset
of $\Re(x) > D$.

Now $y^{(t)}(x, b)$ is a solution of (12). Hence, since the $q_{ks}$ appearing in (12)
tend to 1 as b tends to infinity, it follows that $y^{(t)}(x)$ is a solution of (1).

Since $|y^{(t)}(x, b)| < M|x|^{p/n}$ in $|x - b| < b - D$, it follows that $|y^{(t)}(x)| \le M|x|^{p/n}$ in $\Re(x) > D$.

Let $\epsilon$ be any positive number. Since by Lemma 15 $y^{(t)}(x, b)x^{-p/n}$ is almost
constant, there is a positive number $D(\epsilon)$ such that for all sufficiently large
values of b,


(44) $|y^{(t)}(x, b)x^{-p/n} - y^{(t)}(b, b)b^{-p/n}| < \epsilon$
when $|x - b| < b - D(\epsilon)$. Hence if x is real and greater than $D(\epsilon)$,
(45) $|y^{(t)}(x, d_k)x^{-p/n} - y^{(t)}(d_k, d_k)d_k^{-p/n}| < \epsilon,$


for all sufficiently large values of k. By Lemma 15, $\lim y^{(t)}(d_k, d_k)d_k^{-p/n} = \sigma_t$,
as $k \to \infty$. By this and (43), it follows from (45) that


(46) $|y^{(t)}(x)x^{-p/n} - \sigma_t| \le \epsilon.$
Thus, as x tends to $\infty$ along the positive real axis, $y^{(t)}(x)x^{-p/n}$ tends to $\sigma_t$,
(t = 1, ···, n). Since the numbers $\sigma_t$ are distinct, it follows that the functions
$y^{(t)}(x)$ are distinct.


Part VI. APPROXIMABILITY, UNIQUENESS, AND ALMOST CONSTANCY OF 
SOLUTIONS OF THE DIFFERENCE EQUATION 


Lemma 17. Let $y_0(x)$ be any solution of the difference equation (1) which is
analytic in $\Re(x) > D$ and there satisfies the condition $|y_0(x)| < M|x|^{p/n}$, for
some positive numbers D and M. Then $y_0(x)$ satisfies the q-difference equation


(47) $\sum_{k=1}^{m} a_k(x) \prod_{s=1}^{s(k)} y_0(q_{ks} x + \omega_{ks}) = \psi(x),$
where
(48) $\psi(x)x^{-p}$ is almost constant, and $\psi(b)b^{-p} \to c$ as $b \to \infty$, and the numbers $q_{ks}$ are defined as
in Definition 1.


7 Montel, loc. cit.

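Proof. By hypothesis


$\sum_{k=1}^{m} a_k(x) \prod_{s=1}^{s(k)} y_0(x + \omega_{ks}) = \varphi(x).$


Hence

(49) $\sum_{k=1}^{m} a_k(x) \prod_{s=1}^{s(k)} y_0(q_{ks} x + \omega_{ks}) = \varphi(x) + \sum_{k=1}^{m} a_k(x) \Bigl[\prod_{s=1}^{s(k)} y_0(q_{ks} x + \omega_{ks}) - \prod_{s=1}^{s(k)} y_0(x + \omega_{ks})\Bigr].$

Thus we obtain (47), with

(50) $\psi(x) = \varphi(x) + \sum_{k=1}^{m} a_k(x) \Bigl[\prod_{s=1}^{s(k)} y_0(q_{ks} x + \omega_{ks}) - \prod_{s=1}^{s(k)} y_0(x + \omega_{ks})\Bigr],$


and properties (48) to be established. Let


(51) $d(x) = \psi(x) - \varphi(x).$

Let

(52) $e(x) = d(x)x^{-p}.$

Then

(53) $e(x) = \sum_{k=1}^{m} a_k(x)\, x^{-p(n - s(k))/n}\, h_k(x),$
where

(54) $h_k(x) = x^{-p s(k)/n} \Bigl[\prod_{s=1}^{s(k)} y_0(q_{ks} x + \omega_{ks}) - \prod_{s=1}^{s(k)} y_0(x + \omega_{ks})\Bigr],$ (k = 1, ···, m).


By Lemma XI, $h_k(x)$ is almost constant, and $\lim h_k(b) = 0$, as $b \to \infty$.

By (7), if $p \le 0$, there are no values of k for which $s(k) \ne n$. Hence, whether $p \le 0$
or p > 0, the factor $x^{-p(n - s(k))/n}$ is bounded as $x \to \infty$, and is almost
constant by Lemma III.

By hypothesis, $a_k(x)$ is almost constant, and bounded as $x \to +\infty$.

Hence $\lim_{b \to \infty} e(b) = 0$, and by Lemmas VI, V, e(x) is almost constant.

Now $\psi(x)x^{-p} = \varphi(x)x^{-p} + e(x)$. By hypothesis $\varphi(x)x^{-p}$ is almost constant
and $\lim_{b \to \infty} \varphi(b)b^{-p} = c$. Hence $\psi(x)x^{-p}$ is almost constant and $\lim_{b \to \infty} \psi(b)b^{-p} = c$.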

Lemma 18. Let e(x) be the function $(\psi(x) - \varphi(x))x^{-p}$. Define u = x − b,
$e(x) = \sum_{j=0}^{\infty} e_j u^j$, and


(55) $E(u) = \sum_{j=0}^{\infty} |e_j|\, u^j.$


Then there is a positive number $D_1$ such that
(56) $\lim_{b \to \infty} \max\,(|E(u)|;\ |u| \le b - D_1) = 0.$


Proof. Let $h_k(x)$ be the function defined in (54). Define $h_k(x) = \sum_{j=0}^{\infty} h_{kj} u^j$,
(k = 1, ···, m). Then by Lemma XI there is a positive number $D_1$ such that


(57) $\lim_{b \to \infty} \max\Bigl(\sum_{j=0}^{\infty} |h_{kj}|\, |u|^j;\ |u| \le b - D_1\Bigr) = 0.$


Let $a_k(x) = \sum_{j=0}^{\infty} a_{kj} u^j$. Then, since by hypothesis $a_k(x)$ is almost constant, and
$\lim a_k(b) = a_k$, it follows that if $D_1$ is sufficiently large there is a positive
number M such that


(58) $\sum_{j=0}^{\infty} |a_{kj}|\, |u|^j < M,$


when $|u| \le b - D_1$.

Consider the factor $x^{-p(n - s(k))/n}$ appearing in (53). By (7) the exponent of x
is not positive. Hence if we define $d_k(x) = x^{-p(n - s(k))/n}$, with $d_k(x) = \sum_{j=0}^{\infty} d_{kj} u^j$,
we have by Lemma III and the fact that $d_k(b)$ is bounded for large b, the result
that if $D_1$ and M are sufficiently large,


(59) $\sum_{j=0}^{\infty} |d_{kj}|\, |u|^j < M,$ (k = 1, ···, m),
when $|u| \le b - D_1$.
Now
(60) $|E(u)| \le E(|u|),$
and by (53)

$E(|u|) \le \sum_{k=1}^{m} \Bigl(\sum_{j=0}^{\infty} |a_{kj}|\, |u|^j\Bigr) \Bigl(\sum_{j=0}^{\infty} |d_{kj}|\, |u|^j\Bigr) \Bigl(\sum_{j=0}^{\infty} |h_{kj}|\, |u|^j\Bigr),$

whence from (57), (58), (59) and (60) follows
$\lim_{b \to \infty} \max\,(|E(u)|;\ |u| \le b - D_1) = 0.$


Lemma 19. Let $y_0(x)$ be as in Lemma 17. Then $y_0(x)x^{-p/n}$ is almost constant,
and there is a value of t such that $\lim_{b \to \infty} y_0(b)b^{-p/n} = \sigma_t$, where $\sigma_1, \cdots, \sigma_n$ are
the numbers defined in Lemma 6.

Proof. By Lemma 17, $y_0(x)$ satisfies equation (47). Since the hypotheses
for the coefficients of (47) are identical with those for the coefficients of equation
(12), and since the numbers $\sigma_1, \cdots, \sigma_n$ are defined in terms of the numbers
c and $a_k$, (k = 1, ···, m) which have the same significance for (47) as for (12),
it follows that Lemma 15 remains valid with reference to the solution $y_0(x)$ of
equation (47), instead of to the solution $y^{(t)}(x, b)$ of equation (12).

Hence $y_0(x)x^{-p/n}$ is almost constant, and if b is large, $y_0(b)b^{-p/n}$ is near one of
the numbers $\sigma_1, \cdots, \sigma_n$. Which one of these $y_0(b)b^{-p/n}$ is near might conceivably depend upon b, but $y_0(b)b^{-p/n}$ varies continuously with b, and therefore
there is a fixed value of t such that $y_0(b)b^{-p/n}$ is near $\sigma_t$ when b is large.

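Lemma 20. Let $y_0(x)$ be as in Lemma 17. Let t be the number exhibited in
Lemma 19, such that $\lim y_0(b)b^{-p/n} = \sigma_t$, as $b \to \infty$. Let $y^{(t)}(x, b)$ be the function
described in Lemma 15. Then there is a positive number $D_2$ such that


(61) $y_0(x) = \lim_{b \to \infty} y^{(t)}(x, b),$


when $\Re(x) > D_2$, the limit being uniform in every closed bounded subset of the
region $\Re(x) > D_2$.
Proof. Define $z_0(x) = y_0(x)x^{-p/n}$, and $z(x) = y^{(t)}(x, b)x^{-p/n}$. Since
$y^{(t)}(x, b)$ satisfies (12), z(x) satisfies (15). Likewise, since $y_0(x)$ satisfies (47), $z_0(x)$
satisfies


(62) $\sum_{k=1}^{m} \alpha_k(x) \prod_{s=1}^{s(k)} z_0(q_{ks} x + \omega_{ks}) = \rho(x),$


where $\rho(x) = \psi(x)x^{-p}$, and the $\alpha_k(x)$ are defined by (16).
Subtraction of equation (15) from equation (62) gives


(63) $\sum_{k=1}^{m} \alpha_k(x) \Bigl[\prod_{s=1}^{s(k)} z_0(q_{ks} x + \omega_{ks}) - \prod_{s=1}^{s(k)} z(q_{ks} x + \omega_{ks})\Bigr] = e(x),$


where e(x) is defined by (51) and (52).


Hence

(64) $\sum_{k=1}^{m} \sum_{s=1}^{s(k)} c_{ks}(x)\, [z_0(q_{ks} x + \omega_{ks}) - z(q_{ks} x + \omega_{ks})] = e(x),$
where

(65) $c_{ks}(x) = \alpha_k(x) \prod_{i=1}^{s-1} z(q_{ki} x + \omega_{ki}) \prod_{i=s+1}^{s(k)} z_0(q_{ki} x + \omega_{ki}).$
Let (66) $v(x) = z_0(x) - z(x)$. Then

(67) $\sum_{k=1}^{m} \sum_{s=1}^{s(k)} c_{ks}(x)\, v(q_{ks} x + \omega_{ks}) = e(x).$


Let $c_{ks}(x) = \sum_{j=0}^{\infty} c_{ksj} u^j$, $v(x) = \sum_{j=0}^{\infty} v_j u^j$, $e(x) = \sum_{j=0}^{\infty} e_j u^j$, where u = x − b.


Then, by (67) and Lemma 1,


(68) $\sum_{k=1}^{m} \sum_{s=1}^{s(k)} \Bigl(\sum_{j=0}^{\infty} c_{ksj} u^j\Bigr) \Bigl(\sum_{j=0}^{\infty} v_j q_{ks}^j u^j\Bigr) = \sum_{j=0}^{\infty} e_j u^j,$


and therefore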




m s8(k) 


k=1 s=1 =0 
(69) m 

k=1 s=l \j=1 
The coefficient of $u^j$ in the left-hand member of (69) is $v_j d_j$, where, by definition,
m 
(70) dj = . 
Now from (65) follows 
s—1 s—l 


Ciso0 = Cke(b) = ax(b) 2(b) 20(b) = ax(b) lye, iL 


i=s+ 


Hence, by Lemma 5, if $s(k) \ne n$, $c_{ks0} \to 0$ as $b \to \infty$. Therefore, when $b$ is large,
all the terms in $d_j$ for which $s(k) \ne n$ are small. The remaining terms are near
agi, , the modulus of which is at least MP RD wan 
> 21 agis), and this, by Lemma 5, is bounded below by a positive number 
independent of 7 and b. Thus 


(71) | d;| > 
where M, is a positive number independent of j. Let Cis; = | ces; |, Bj; = | e; |, 
(k = i, = 1, , 8(k); 7 = 0, 1, 2, coe). Let V; (j = 0, 1, 2, 
be defined by 
m_ s(k) 
Then 
(73) $V;,G =90,1,2,---), 


by (69) and (71). (A formal proof may be given by induction on j.) _ 
Define V(u) = Doo E(u) = Ew’, = Crsju’, (k = 1, 
,m;s = 1,---,s8(k). Then from (72) it follows that 
m 8s(k) 


(74) MiV(u) = E(u) + cw) | 


or 


m 8(k) 
(75) = Bu) / [ - | 


Now the function z(x) is almost constant, by Lemma 15. Thus, if z(7) = 
de z,u’, there is for every positive ¢ a positive D(e) such that for all sufficiently 
large values of b, 


(76) lal <6 




when |u| <b — D(e). Now + Wis) = Doo zig’, (k = 1,-++, m; 
s = 1,---, s(k)), by Lemma 1, and it follows from (76) that 
| Soo | | u? — | | < when |u| <b — D(e). Hence + wis) is 
almost constant, (k = 1, ---,m;s = 1, ---, s(k)). 

Likewise the function $z_0(x)$ is almost constant, since $y_0(x)$ is a solution of (47),
and therefore every function $z_0(q_{ks}x + w_{ks})$ is almost constant.

By (28), $a_k(x)$ is almost constant $(k = 1, \dots, m)$.

Hence, by Lemma VI, $c_{ks}(x)$ is almost constant $(k = 1, \dots, m;\ s = 1, \dots,
s(k))$.

This implies that there is a positive number $D_2$ such that

m 8(k) 1 


(Cis(u) — <5 Ma, if|u| b— De. 


Thus (75) implies
(77) $|V(u)| < 2\,|E(u)|$ when $|u| \le b - D_2$.


Hence from Lemma 18 follows $\max\,(|V(u)|;\ |u| < b - D_2) \to 0$ as $b \to \infty$,
provided $D_2$ is taken greater than $D_1$.

Therefore from (73) follows $\max\,(|v(x)|;\ |x - b| \le b - D_2) \to 0$ as $b \to \infty$,
or $\max\,(|z_0(x) - z(x)|;\ |x - b| \le b - D_2) \to 0$ as $b \to \infty$.

This implies that (61) is valid with the limit uniform in every closed bounded
subset of the region $R(x) > D_2$.
Lemma 21. If $D$ is any positive number there are at most $n$ distinct solutions of
the difference equation (1), analytic in the region $R(x) > D$, and there satisfying (9).
Proof. Assume that there are more than $n$ such solutions, say $y_1(x), y_2(x), \dots,
y_l(x)$, with $l > n$. Then by Lemma 20 there is a number $t(i)$ in the set $(1, 2, \dots,
n)$ such that

(78) $y_i(x) = \lim y^{(t(i))}(x, b)$ as $b \to \infty$.

But there must be two values of $i$ for which $t(i)$ has the same value, and this
implies by (78) that two of the functions $y_i(x)$ are identical. This contradiction
establishes Lemma 21.


Part VII. Summary 
Theorem 2 follows immediately from Lemmas 16, 19, 20, and 21. 


Appendix. Almost Constant Functions 


1. Definition of Almost Constant Functions. See Definition 2, above.

2. Notation. If $f$ is analytic at $x = b$, and can therefore be expanded in a Taylor's
series $\sum_{j=0}^{\infty} \beta_j (x - b)^j$, the function $\sum_{j=0}^{\infty} |\beta_j| u^j$ will be denoted by the symbol
$f_A(u, b)$.

3. Lemma I. Let $f(x)$ be analytic in a half-plane $R(x) > D > 0$, and satisfy there
the inequality $|f(x) - f(b)| < M(b^{-1} + |x|^{-1})$ for some $M$ and all sufficiently
large values of $b$. Then $f(x)$ is almost constant.


Proof. Let $f(x) = \sum_{j=0}^{\infty} \beta_j (x - b)^j$. Let $D_0$ be any number greater than $D$.
Let b be any number greater than D)». Let |u| < b — 2D), p = b — Do. 
Then 


|fa(u, b) — fa(0, b) |? = |B; | 


| | 8; | p?(u/p)? 


S] + |b + + [b+ pe” ao] bDo" 


[ + |b + pe” |’) ao] 

= 2M’ + — S [b* + Do’. 
If Do is large, the last member is small, for all large values of b. Hence f(x) is 
almost constant. 
4. Lemma II. Every function of the form $(a + B/x)^s$, where $a$ is positive, $B$ is
non-negative, $s$ is real, and the branch chosen (in the right half-plane) is positive
for positive $x$, is almost constant.
Proof. Let $f(x) = (a + B/x)^s$. Let $x$ be any number such that $|B/x| < a/2$.
Then $f(x) - a^s = (a + B/x)^s - a^s = (a + \rho e^{i\theta})^s - a^s$, where a non-negative $\rho$
and a real $\theta$ are defined by $\rho e^{i\theta} = B/x$. Let $F(\rho) = (a + \rho e^{i\theta})^s$. Then
$$f(x) - a^s = F(\rho) - F(0) = \int_0^{\rho} F'(r)\,dr.$$
Therefore $|f(x) - a^s| \le \rho M \le M_1 |x|^{-1}$ for suitable constants $M$, $M_1$ independent
of $\rho$, $|x|$ respectively. Likewise $|f(b) - a^s| \le M_1 b^{-1}$. Hence
$|f(x) - f(b)| \le M_1(b^{-1} + |x|^{-1})$. By Lemma I, $f(x)$ is almost constant.

5. Lemma III. Every function of the form $(x - a)^{-\sigma}$, where $a$ is an arbitrary
complex number, where $\sigma$ is non-negative, and where the branch chosen (in the
half-plane $R(x) > R(a)$) is any branch, is almost constant.

Proor. We may and do assume that o > 0. Let f(z) = (« — a)”. Let 
b>|a|. Lette =b—a. Thenf(x) = (x—b+c)* =c*(1 + (@—b)/c)” 
= (—1)"(—0)(—o — 1) — + Iw’ = 
u) Hence fa(u, b) — fs(0, b) = — — Hence if |u| < 
b — D, where D is any number greater than | a |, | fa(u, b) — fu(0, b) | < (D — 
|a|)* + (b—|a])’. If € is any positive number, D can be chosen so that 


8 Landau, "Darstellung und Begründung einiger neuerer Ergebnisse der Funktionentheorie," Berlin 1929, page 8.




(D — |a|) < ¢/2, and then for all sufficiently large values of b, | fa(u, b) — 
fa(0O, b) | < Hence f(x) is almost constant. 
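
As a numerical illustration of Lemma III in its simplest case ($a = 0$, $\sigma = 1$), the following Python sketch builds the majorant $f_A(u, b)$ of $f(x) = 1/x$ from its Taylor coefficients at $x = b$. The function names and the truncation length are assumptions made for this illustration and do not appear in the paper. For this particular $f$ the majorant has the closed form $1/(b - u)$, so the oscillation $f_A(u, b) - f_A(0, b)$ at $|u| = b - D$ equals $1/D - 1/b$, which is below $1/D$ for every $b$.

```python
def majorant_reciprocal(u, b, terms=400):
    """Partial sum of f_A(u, b) = sum_j |beta_j| u**j for f(x) = 1/x.

    The Taylor coefficients of 1/x about x = b are beta_j = (-1)**j / b**(j + 1),
    so |beta_j| u**j = (u / b)**j / b.  Requires 0 <= u < b.
    """
    ratio = u / b
    return sum(ratio**j for j in range(terms)) / b


def majorant_oscillation(u, b, terms=400):
    """f_A(u, b) - f_A(0, b): the quantity that must be small when |u| < b - D."""
    return majorant_reciprocal(u, b, terms) - majorant_reciprocal(0.0, b, terms)


if __name__ == "__main__":
    D = 50.0
    for b in (100.0, 1000.0):
        u = b - D
        # Compare the truncated series with the closed form 1/D - 1/b.
        print(b, majorant_oscillation(u, b), 1.0 / D - 1.0 / b)
```

The bound $1/D$ does not depend on $b$, which is the behaviour the definition of an almost constant function asks for.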

6. Lemma IV. If $f(x)$ is almost constant, $f(b)$ tends to a limit as $b$ tends to
positive infinity.

Proof. Let $\epsilon$ be any positive number. Let $D(\epsilon)$ be such that $|f_A(u, b) -
f_A(0, b)| < \epsilon$ when $|u| < b - D(\epsilon)$, provided $b > b_1$. Choose $b_2$ greater than
$b_1$ and greater than $D(\epsilon)$, and let $b_3 \ge b_2$. Then $|f_A(u, b_3) - f_A(0, b_3)| < \epsilon$,
if $|u| < b_3 - D(\epsilon)$. A fortiori, $|f(x) - f(b_3)| < \epsilon$ if $|x - b_3| < b_3 - D(\epsilon)$.
In particular, $|f(b_2) - f(b_3)| < \epsilon$, since $|b_2 - b_3| < b_3 - D(\epsilon)$. Hence $\lim f(b)$,
as $b \to \infty$, exists.

7. Lemma V. If $f_1(x, b), \dots, f_N(x, b)$ are almost constant, so is $f_1(x, b) + \cdots
+ f_N(x, b)$. Proof omitted.

8. Lemma VI. If $f_1(x, b), \dots, f_N(x, b)$ are almost constant, so is the product
$f_1(x, b) f_2(x, b) \cdots f_N(x, b)$, provided $|f_1(b, b)|, \dots, |f_N(b, b)|$ are bounded by a
number $M$.

Proof. Case 1. $N = 2$. Let $f(x, b)$, $g(x, b)$ be almost constant. Let
$h(x, b) = f(x, b)g(x, b)$. To prove $h(x, b)$ almost constant.

Let $\epsilon$ be positive, and let $D(\epsilon)$ be such that $|f_A(u, b) - f_A(0, b)| < \epsilon$ and
$|g_A(u, b) - g_A(0, b)| < \epsilon$, when $|u| < b - D(\epsilon)$ (for all sufficiently large
values of $b$). Then
$$|h_A(u, b) - h_A(0, b)| \le h_A(|u|, b) - h_A(0, b) \le f_A(|u|, b)\,g_A(|u|, b) - f_A(0, b)\,g_A(0, b)$$
$$= [f_A(|u|, b) - f_A(0, b)]\,[g_A(|u|, b) - g_A(0, b)] + f_A(0, b)\,[g_A(|u|, b) - g_A(0, b)] + g_A(0, b)\,[f_A(|u|, b) - f_A(0, b)]$$
$$< \epsilon^2 + M\epsilon + M\epsilon.$$
This quantity can be made small by taking $\epsilon$ small. Hence $h(x, b)$ is
almost constant.

Case 2. $N \ge 2$. Proof by induction.

9. Lemma VII. Every rational function whose numerator-degree is not greater
than its denominator-degree is almost constant.
Proof. Let $f(x)$ be such a function. Then, since $f(x)$ can be expanded in
partial fractions, there is a relation $f(x) = c + \sum_{j,k} c_{jk}(x - a_k)^{-j}$, where the
$a_k$ are the poles of $f(x)$ and $j$ is always positive. Now the constants $c$, $c_{jk}$ are
obviously almost constant, and each of the functions $(x - a_k)^{-j}$ is almost constant,
by Lemma III, so by Lemmas VI and V, $f(x)$ is almost constant.

10. Lemma VIII. Let $f(x)$ be analytic in $R(x) > D > 0$, and there satisfy the
inequality $|f(x)| < M |x|^p$ for some positive $M$ and some real $p$. Then there is a
positive constant $M_1$ such that $|f'(x)| < M_1 |x|^p [R(x) - D]^{-1}$ in $R(x) > D$.
Proof. Let $x$ be any complex number such that $R(x) > D$. Let $C$ be the
circle $\zeta = x + \rho e^{i\theta}$, where $\rho = (R(x) - D)/2$. Then
$$f'(x) = (2\pi i)^{-1} \int_C f(\zeta)\,(\zeta - x)^{-2}\,d\zeta.$$
Hence $|f'(x)| < (2\pi)^{-1}\, M\, M(x)\, \rho^{-2} \cdot 2\pi\rho = M\, M(x)\, \rho^{-1}$, where
$M(x)$ is the maximum over $C$ of the function $|\zeta|^p$. Now on $C$, $|\zeta| > |x| -
|x|/2$, and $|\zeta| < |x| + |x|/2$. Hence, if $p \ge 0$, $|\zeta|^p \le (3/2)^p |x|^p$, and if
$p < 0$, $|\zeta|^p \le (1/2)^p |x|^p$. Thus there is an $M_1$ such that $|f'(x)| < M_1 |x|^p
[R(x) - D]^{-1}$.

11. Lemma IX. Let w, D be positive numbers such that D > w. Let f(x) be 




analytic in $R(x) > D$, and there satisfy the inequality $|f'(x)| < M_1 |x|^p (R(x) -
D)^{-1}$ for some positive $M_1$. Then there is a positive number $M_2$ such that if $b$ is any
number greater than $3w$, and if $q = 1 - w/b$, then the inequality $|f(x + w) -
f(qx + w)| < M_2 (1 - q) |x|^{p+1}/R(x)$ is valid for all $x$ such that $R(x) > 2D$.

Proof. $R(qx + w) = qR(x) + w > (1 - 1/3)(2D) + w > D$. Hence $qx + w$
belongs to the half-plane $R(x) > D$. Then $f(x + w) - f(qx + w) = \int f'(\zeta)\,d\zeta$,
where the integral is taken along the straight line segment joining $qx + w$
to $x + w$. Thus $|f(x + w) - f(qx + w)| \le |(x + w) - (qx + w)|\, M'$, where
$M'$ is the maximum value of $|f'(\zeta)|$ on the line segment. Thus for some number
$\lambda$ between $q$ and 1

— q) |x|? for some suitable Mo. 

12. Lemma X. Let M, D be positive numbers. Let h(x, b) be a function of x 
and b which, for all sufficiently large values of b, is analytic in R(x) > D, and there 
satisfies the inequality | h(x, b)| < Mb™* | 2 | /R(z). Let the Taylor’s series for 
h(x, b) in powers of x — h,(b) (« — b)." Then | 9 | ha(b) | u" | 
M D*"o"4 when | u| < b — 2D, (for all sufficiently large values of b). 

Proof. Let $b$ be any number greater than $3D$. Let $\rho$ equal $b - D$, and let $u$
satisfy the inequality $|u| < b - 2D$. Then


< | + pe’ [R(b + do ao] 


X — Jul’). 
Now bp’ < 1, and p’ — |u|? = (b — D)’ — (b — 2D)’ = 2bD — 3D’ > bD. 
Hence 


n(b) | u" 


(27) | |b + pe’ |? [9(b + pe”)? ao] 


= (2xbD) — = — D*)*. 


Hence 


13. Lemma XI. Let f(x) be analytic in R(x) > D > O, and there satisfy the in- 
equality | f(x) | < M | x |’ for some positive M and some real oc. Let , 
w, be non-negative numbers, and let qi , qs be defined by q; = 1 — 


(j = 1,---, 8). Then the function h(x), (of x and b), defined by h(x) = x” 




f(x + — + is almost constant and h(b) > 0, as b 
Moreover, if h(x) = — b)", and if H(x) = | ha | — b)", then 
there is a positive number D’ such that max (| H(x) |; — b| — D’) 390 
asb— ~, 

Proor. By Lemmas VIII, IX, there is a positive constant Ms such that 
| f(x + — + | < |x for all sufficiently large 
values of b, provided R(x) > 2D, (¢ = 1, 2,---, 8). Now 


t-—1 8 
h(a) = [fe + — fae +o) faz + 
Hence by Lemma IX, ; 
8 t— 8 


< M3b™ | x |/R(x), when R(x) > D’, for some suitably chosen positive numbers 
M; and D’. Hence Lemma X is applicable, to prove that h(x) is almost con- 
stant, that h(b) 0 as b— «, and that max (| H(x) — b| < b — D’) 0, 
as b > 


COLUMBIA UNIVERSITY.



ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


A NOTE ON CESARO SUMMABILITY OF FOURIER SERIES 


By FU TRAING WANG
(Received December 28, 1942) 


1. It is known that the (C, α) continuity of a function is not sufficient for the
(C, α) summability of its Fourier series at a point. Many sufficient conditions
for (C, α) summability of a Fourier series have been found by various authors.¹ The
object of this note is to give a further sufficient condition, which arises from a
convergence criterion for Fourier series recently proved by the author.²

Let $\varphi(t)$ be an even integrable periodic function with period $2\pi$, and let its Fourier
series be


(1.1) $\varphi(t) \sim \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos nt,$
and let us suppose 


Then we have the following convergence criterion. 


If (1.2) holds anda, > — Ky, — ; then the Fourier series (1.1) converges to sum 


$s$ at $t = 0$.
Without the order condition in this convergence criterion we have the fol- 
lowing 


THEOREM. Let $n$ be a positive integer such that $n \ge \beta > n - 1$. If (1.2)
holds then the Fourier series (1.1) is summable $(C, (\gamma(n - 1) + \beta)/(\gamma + n - \beta))$
to sum $s$ at $t = 0$.

Put $\alpha = (\gamma(n - 1) + \beta)/(\gamma + n - \beta)$; then we can easily verify the following
inequality:

$$\beta > \alpha > n - 1.$$


It is evident that the theorem is not true for the case $\beta = \gamma$.
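
For readers who want to experiment with (C, α) means of fractional order, the following Python sketch computes them for the first terms of any numerical series. It assumes the standard definition $\sigma_n^{\alpha} = S_n^{\alpha}/A_n^{\alpha}$ with $A_n^{\alpha} = \binom{n+\alpha}{n}$ and $S_n^{\alpha} = \sum_{k=0}^{n} A_{n-k}^{\alpha} u_k$, which this note does not restate; the function name is hypothetical and the test series is a toy example, not one satisfying condition (1.2).

```python
def cesaro_means(terms, alpha):
    """Return the (C, alpha) means sigma_0, ..., sigma_N of the series sum(terms).

    Assumes alpha > -1 and the standard definition
    sigma_n = S_n / A_n,  A_n = binom(n + alpha, n),  S_n = sum_k A_{n-k} * terms[k].
    """
    n_terms = len(terms)
    # A[n] = binom(n + alpha, n), built from A_n = A_{n-1} * (n + alpha) / n.
    A = [1.0]
    for n in range(1, n_terms):
        A.append(A[-1] * (n + alpha) / n)
    means = []
    for n in range(n_terms):
        s_n = sum(A[n - k] * terms[k] for k in range(n + 1))
        means.append(s_n / A[n])
    return means


if __name__ == "__main__":
    # Grandi's series 1 - 1 + 1 - ... diverges but is (C, 1) summable to 1/2.
    grandi = [(-1) ** k for k in range(1000)]
    print(cesaro_means(grandi, 1.0)[-1])
    print(cesaro_means(grandi, 0.5)[-1])
```

With `alpha = 0` the routine returns the ordinary partial sums, so it can also be used to compare convergence with summability of a given order.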
2. In order to prove the theorem we put 


Ca(w) = + — n)*a, 


n<w 


and 


Ya(w, t) = [ (w — x)* cos at dx 


1 Hardy and Littlewood [3]. Gergen [4].
2 Wang [5].


Then we get’ 

tr a a 
(2.1) Ca(w) = $(t)va(w, t) dt + so” + 


Concerning the function y.(w, t) we have the following properties. 
Lemma 1.‘ Let K; be certain constants depending only on B; then 


Ya" (w, t) = {ya(w, = (a—n+ "ya-n(w, t) 
(2.2.1) 
‘ + ky "ya-r(w, t) 
and 
(2.2.2) Ya-n(w,t) = for all w and t 
and 


cos | wt (a 


(2.2.3) ya-n(wt) = —n + 1) + 


for wt = 1. 
Lemma 2. Let w’ = and n= 3 Uf dn(t) = o(t”") then 
cos [at (a—n+1)| 
2.3) Caw) = 


+- w*) ao" 0(w*). 
Proor. From integration by parts and (2.1) we get 


dt 


241) Colo) = £2 dt + + 06"), 


since 


/ dt = for 1 
0 


By Lemma 1 and (2.4.1) we get 


cos | 2+ 
dt 


(2.4.2) Caw) = (-1)"2 


+ sw* + o(w*). 


3 Gergen [4]. 
4 Bosanquet and Linfoot [2], and Bosanquet [1]. 




By the second mean value theorem then 


cos at (a—n+0)] 


(2.4.3) on(t) dt = O(p 

From (2.4.2) and (2.4.3), Lemma 2 follows. 

Lemma 3. Put 

cos E ~ »| 
E(w, u) = [ (¢— dt; 

then 
(2.5.1) E(w, u) = O(u”**") for all w and u, 
and 


cos wu — F(a +8 
(2.5.2) E(w, u) = T(n — yeti 


+ O {(w!" + O(u"* w’) 


for wu 2 1. 
Proof. By the second mean value theorem we get


cos | wt (a—n+1)| 


From changes of variable and the second mean value theorem we then obtain 


(2.6.2) [ Ez +) 5 (a-—n+ dv 


+ 
By a theorem on Γ-functions⁵ we get
[ cog Ez + yr) - 5 (a —n+ »| dy 
(2.6.3) 


cos | wu F(a + 
= T'(n — 8) (uw)*-8 


5 Titchmarsh [6]. 




A combination of (2.6.1), (2.6.2), (2.6.3) will give the proof of the lemma. 
PROOF OF THE THEOREM. If (1.2) holds then 
= o(t*). Hence ¢,(t) = o(¢”). 


and 
— 8) 


If we substitute (2.7) in (2.3) and invert the order of repeated integration we 
have, by Lemma 3, 


+ O(p So" + 0(w"), 
By (2.5.1) and (2.5.2) we have 


cos E (at B— 2+ »| 


O(p w*) + aw" + 0(w"). 


Ca(w) 


= sw + 0(w*) as p tends to infinity. 


This completes the proof of the theorem.
Finally, an example shows that the theorem is a best possible theorem of its 
kind for the case 8 = 1. 


NATIONAL UNIVERSITY OF CHEKIANG,
MEITAN, KWEICHOW,
CHINA.


REFERENCES 


1. L. S. Bosanquet, "A solution of the Cesaro summability problem for successively derived Fourier series," Proc. London Math. Soc., 2, vol. 46, 1940, 270-289.

2. L. S. Bosanquet and E. H. Linfoot, "Note on an asymptotic formula," Tohoku Math. J., 39 (1934), 11-16.

3. G. H. Hardy and J. E. Littlewood, "Solution of the Cesaro summability problem for power series and Fourier series," Math. Zeit., 19 (1924), 67-96.

4. J. J. Gergen, "Convergence and summability criteria for Fourier series," Quarterly Math. Journ., 1 (1930), 252-275.

5. F. T. Wang, "On Riesz summability of Fourier series I, II," Proc. London Math. Soc., vol. 47, 1942, 308-325.

6. E. C. Titchmarsh, "Theory of functions," p. 107.



ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


THE INVERSE NORLUND MEAN 


By FLORENCE M. MEARS
(Received November 16, 1942) 


1. Introduction 


Consider the series of complex terms, $\sum_{n=0}^{\infty} u_n$, and denote the corresponding
sequence by $\{U_n\}$; then $U_n = \sum_{k=0}^{n} u_k$. When there is no possibility of confusion,
we shall use the abbreviated notation $\sum u_n$ for $\sum_{n=0}^{\infty} u_n$. In this paper we
are concerned exclusively with sequence to sequence transformations with triangular
matrices, which assign to a given sequence $\{U_n\}$ the value $\lim_{n\to\infty} U'_n$,
where


(1) $U'_n = \sum_{k=0}^{n} a_{nk} U_k,$


provided that limit exists. If $U'_n \to U'$, $\sum u_n$ is said to be summable to $U'$ by
the transformation (1); if $\sum |u'_n|$ converges, where $u'_n = U'_n - U'_{n-1}$, then
$\sum u_n$ is said to be absolutely summable by (1).

We list for reference the following conditions: 


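(2) $\lim_{n\to\infty} a_{nk} = 0$, $k = 0, 1, \dots$;
(3) $\lim_{n\to\infty} \sum_{k=0}^{n} a_{nk} = C$, where $C$ is a constant;
(4) $\lim_{n\to\infty} \sum_{k=0}^{n} a_{nk} = 1$;
(5) $\sum_{k=0}^{n} |a_{nk}| < M$, where $M$ is a constant; $n = 0, 1, \dots$;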
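(6) $\left|\sum_{p=k}^{n} a_{np}\right| < M$, $n, k = 0, 1, \dots$;
(7) $\sum_{n=k}^{\infty} \left|\sum_{p=k}^{n} (a_{np} - a_{n-1,p})\right| < M$, $k = 0, 1, \dots$.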


The transformation (1) is said to be regular if $U'_n \to U$ whenever $U_n \to U$; it
is said to be absolutely regular if $\sum |u'_n|$ converges whenever $\sum |u_n|$ converges.
We recall the following theorems:

In order that (1) be regular, (2), (4) and (5) are necessary and sufficient.¹

In order that $U'_n \to CU$ whenever $U_n \to U$, (2), (3) and (5) are necessary and
sufficient.²
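
The regularity criterion just recalled can be explored numerically on a finite section of a triangular matrix. The Python sketch below is only an illustration: the helper names are hypothetical, the (C, 1) matrix $a_{nk} = 1/(n+1)$ is chosen as a familiar test case, and a finite computation can suggest but never prove that the column limits vanish, the row sums tend to 1, and the absolute row sums stay bounded.

```python
def cesaro1_entry(n, k):
    """Entry a_{nk} of the (C, 1) matrix: 1/(n + 1) for k <= n, else 0."""
    return 1.0 / (n + 1) if k <= n else 0.0


def toeplitz_diagnostics(entry, n_max, k_fixed=0):
    """Finite-section quantities related to conditions (2), (4) and (5).

    Returns a tuple:
      |a_{n_max, k_fixed}|           (should be small if column k_fixed tends to 0),
      sum_k a_{n_max, k}             (should be near 1 for a regular matrix),
      max_{n <= n_max} sum_k |a_nk|  (should stay below a fixed bound M).
    """
    column_value = abs(entry(n_max, k_fixed))
    last_row_sum = sum(entry(n_max, k) for k in range(n_max + 1))
    max_abs_row_sum = max(
        sum(abs(entry(n, k)) for k in range(n + 1)) for n in range(n_max + 1)
    )
    return column_value, last_row_sum, max_abs_row_sum


if __name__ == "__main__":
    print(toeplitz_diagnostics(cesaro1_entry, 200))
```

Any other triangular matrix can be tested by passing a different `entry` function with the same two-argument signature.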


1 Silverman, University of Missouri Studies, (1913), p. 49. 
Toeplitz, Prace matematycznofizyczne, vol. 22 (1911), p. 117. 
2 Hausdorff, Mathematische Zeitschrift, vol. 9 (1921), p. 75. 




In order that $U'_n \to CU$ whenever $\sum |u_n|$ converges, (2), (3) and (6) are
necessary and sufficient.³

In order that (1) be absolutely regular, (7) is necessary and sufficient.⁴

In this paper we are concerned particularly with the Nörlund and Cesàro
means, and with their inverses. The Nörlund mean, $(N, p)$, is obtained by
replacing $a_{nk}$ of (1) by $P_n^{-1} p_{n-k}$, $n = 0, 1, \dots$, $k = 0, 1, \dots, n$, where $\{p_n\}$ is a
given sequence of complex constants such that $P_n = \sum_{k=0}^{n} p_k \ne 0$.⁵ Since $p_0 \ne 0$,
and since the transformation is not affected if $p_n$ is replaced by $p_0^{-1} p_n$, we shall
let $p_0 = 1$. Then for $(N, p)$, (1) becomes


(8) $U'_n = P_n^{-1} \sum_{k=0}^{n} p_{n-k} U_k, \qquad n = 0, 1, \dots$


If $\sum u'_n$ converges, $\sum u_n$ is said to be summable $(N, p)$; if $\sum |u'_n|$ converges,
$\sum u_n$ is said to be summable $|N, p|$.
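
A minimal Python sketch of the transformation (8) may help fix the indexing. The function name is hypothetical; the code takes the first terms of $\{p_n\}$ and of $\{U_n\}$ as plain lists and returns the corresponding $U'_n$.

```python
def norlund_transform(p, U):
    """Apply (8): U'_n = (1 / P_n) * sum_{k=0}^{n} p_{n-k} * U_k, P_n = p_0 + ... + p_n.

    p and U are sequences of equal length; entries may be real or complex.
    """
    if len(p) != len(U):
        raise ValueError("p and U must have the same length")
    out = []
    P = 0
    for n in range(len(U)):
        P = P + p[n]
        if P == 0:
            raise ValueError("P_n must be different from zero")
        out.append(sum(p[n - k] * U[k] for k in range(n + 1)) / P)
    return out


if __name__ == "__main__":
    # With p_n = 1 for all n, (8) is the arithmetic mean of U_0, ..., U_n.
    partial_sums = [1, 0, 1, 0]          # partial sums of 1 - 1 + 1 - 1
    print(norlund_transform([1, 1, 1, 1], partial_sums))
```

Choosing $p_n = 1$ for every $n$ reproduces the first arithmetic mean, which is a convenient sanity check before trying other weight sequences.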


OO0---0 


Dn 
The inverse Norlund mean, (N, p)’, is obtained by replacing a,x of (1) by 
(— 1)"* ,n = 071, ---,& =0,1,---,n.° Therefore for (N, (1) 
becomes 


(9) Un, = > (-1)"* Ui, 
k=0 

if >> u, converges, we shall say that >> u, is summable (N, p)’; if > | uw, | 

converges, we shall say that it is summable | N, p | ™*. 
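
The inverse mean can also be computed without an explicit formula for its matrix. The Python sketch below assumes only that the inverse transformation is the one that undoes (8), and, with $p_0 = 1$, solves $P_n U_n = \sum_{k=0}^{n} p_{n-k} W_k$ for $W_n$ by forward substitution. Both function names are hypothetical, and the sketch makes no claim about the closed form of the coefficients used in (9).

```python
def norlund_forward(p, W):
    """(N, p) mean as in (8): returns U with U_n = (1 / P_n) * sum_k p_{n-k} * W_k."""
    out, P = [], 0
    for n in range(len(W)):
        P = P + p[n]
        out.append(sum(p[n - k] * W[k] for k in range(n + 1)) / P)
    return out


def norlund_inverse(p, U):
    """Inverse of (8) by forward substitution, assuming p[0] == 1.

    Returns W such that norlund_forward(p, W) == U, using
    W_n = P_n * U_n - sum_{k < n} p_{n-k} * W_k.
    """
    if p[0] != 1:
        raise ValueError("this sketch assumes the normalization p_0 = 1")
    W, P = [], 0
    for n in range(len(U)):
        P = P + p[n]
        W.append(P * U[n] - sum(p[n - k] * W[k] for k in range(n)))
    return W


if __name__ == "__main__":
    p = [1, 1, 1, 1]                     # weights of the first arithmetic mean
    U = [1, 2, 3, 4]
    W = norlund_inverse(p, U)
    print(W, norlund_forward(p, W))
```

For the weights $p_n = 1$ the inverse reduces to $W_n = (n+1)U_n - nU_{n-1}$, which the example reproduces.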

Let (N, p) be the Cesaro mean, (C, r), where r may take on any value except 
that of a negative integer; then p, = [n!C(r + + If, for (C, 
we replace U;, of (1) by US”, we have 

r Ir — T(r +n— 
1 
(10) T(r + n+ 1) (n — k)! 


For the inverse Nérlund mean, (N, = zo. If we let (N, 
= (C, we can easily prove that = (— 1)*r(r — 1) --- —k +1) 
(k!)*, k = 1, 2, --+ ; therefore when r is integral a,,,-, = Oif k > r. If, for 
(C, r), we replace U’, of (1) by 0”, we have 


3 Hahn, Monatshefte für Mathematik und Physik, vol. 32 (1922), p. 29. This theorem,
proved by Hahn for real numbers only, holds also for complex numbers.

4 Mears, Annals of Mathematics, vol. 38, No. 3 (1937), p. 595.

5 Nörlund, Lunds Universitet Årsskrift, (2), vol. 16 (1920), No. 3, p. 8.

6 Hill, Duke Mathematical Journal, vol. 3, No. 4 (1937), p. 705.




ir) ye +n —k+1)Un4 
where N = r for r integral, if r < n, and N = n in all other cases. 

It is known that $(C, r)$ is regular if and only if $r = 0$, or if the real component
of $r$ is greater than zero. We shall prove the corresponding theorem for absolute
regularity.

THEOREM 1: The Cesàro mean, $(C, r)$, is absolutely regular if and only if $r = 0$
or the real component of $r$ is greater than zero.

Proof: For the Nörlund mean, $(N, p)$, condition (7) becomes


(12) — < M, 


If we let (N, p) = (C, r), r = R + “ip, and if 


A + n) 
Sr) = klr| |’ 
the left side of (12) is equal to f(r). Since (C, r) is absolutely regular for r = 
R20,’ f(R) < M,k =1,2,---,R20. ButforR ~0,f(r) |R'r|f(R); 
therefore (7) is satisfied and (C, r) is absolutely regular if r = 0, or if R > 0. 
Since 


nin 


™ 
we have 
0<m' < nin < M’, 


Tig +n+ 1) 


Therefore f(r) = k |r| m’(M’)* SO%_o [n(k + n)J"', for R < 0, or for R = 0, 
r ¥ 0, and (7) is not satisfied for these values of r. 

In section 2 we shall prove theorems concerning summability $(C, r)^{-1}$ and
$|C, r|^{-1}$, when $r$ is restricted to real values; section 3 includes multiplication
theorems for the Nörlund mean and inverse mean; section 4 includes multiplication
theorems for the Cesàro mean and inverse mean.


2. Cesàro Summability
Throughout this section, we shall let 


where M = s for s integral, s S n, and M = nin all other cases. 


7 Kogbetliantz, Bulletin des sciences mathématiques, (2), vol. 49 (1925), p. 237.


We have 
OY =U, + = O02 +27 [(n + 1)nu, — n(n — 1) uni], 


and in general 
THEOREM 2: If $r$ is neither zero nor a negative integer, then


Proof: Let
S, = nl (r + n — p) 


T(r + 1 — p)(n — p)!p! 
Then 


N-1 


k=0 p=0 


where N is defined as in (11). It can be proved that 


k 
Sp = ( 1) rl'(r — k)ki(n — k — WP? k = 0,1, ,N 1; 
therefore })%=) S, = —Sy and )>*_5 S, = 0. Substituting these results in 


(15), we obtain (14). 

THEOREM 3: If $\sum u_n$ is summable $(C, r + s)^{-1}$ to $U$, it is summable $(C, r)^{-1}$
to $U$, $s \ge 0$, $r > -1$.

Proof: For $s \ge 0$, $r > -1$, $(C, r + s)$ includes $(C, r)$;⁸ therefore $(C, r + s) =
A(C, r)$, where $A$ represents a regular matrix. Since $(C, r)$ and $(C, r + s)$, and


hence their inverses, are permutable, we have
(C, ry" = (C,r+ = AC, r)(C,r + 
= +s)? = 
Therefore $(C, r)^{-1}$ includes $(C, r + s)^{-1}$.

COROLLARY 1: If $\sum u_n$ is summable $(C, r)^{-1}$, $r > 0$, it is summable $(C, r - 1)^{-1}$,
and $\lim_{n\to\infty} x_n^{(r)} = 0$.

THEOREM 4: If $\sum u_n$ is summable $|C, r + s|^{-1}$, it is summable $|C, r|^{-1}$,
$s \ge 0$, $r > -1$.

Proof: For $s \ge 0$, $r > -1$, $|C, r + s|$ includes $|C, r|$;⁹ therefore the matrix
$A$, defined in Theorem 3, is absolutely regular. The rest of the proof is the
same as that of Theorem 3.

Tueoreo 5: Jf lim,.. = 0,r>1,n = 1,2, ---, then lim... = 0. 

Proor: Consider the triangular matrix || @mp || , alas by Gmp = 0, when 
r is integral, p > r — 2; dmp = r(m + + p)[(p + +m+ 


in all other cases. It is proved that #9" = Anpt ,m=0,1,- 


8 Knopp, Sitzungsberichte der Berliner Mathematischen Gesellschaft, vol. 7 (1907), p. 5.
9 McFadden, Duke Mathematical Journal, vol. 9, No. 1 (1942), p. 173. 




But 
|anp| = r(r — 1)7 — r(m — +m+1" 


< r(r — 1) + 1) < M, 
and 
lim dnp = rT (r + p — 1)(p!)>- lim m! + = 0. 


Therefore since $\| a_{mp} \|$ satisfies (2) and (5), the theorem is proved.¹⁰

THEOREM 6: If $\sum u_n$ is summable $(C, r - 1)^{-1}$, $r > 0$, and if $\lim_{n\to\infty} x_n^{(r)} = l$,
then $l = 0$.

Proof: If the hypothesis is satisfied, $\sum u_n$ is summable $(C, r)^{-1}$, as a result
of Theorem 2. Theorem 6 then follows from Theorem 3, Cor. 1.


THEOREM 7: The condition $\lim_{n\to\infty} x_n^{(r)} = 0$ does not imply the convergence of
$\sum u_n$.


ProoF: Let uw = 1, and let 


_ (m—1)!'S +m p) 
T(r + m) p=2 T(r)(m — p + 1)! log p’ 


where (r — 1)I'(r + m — p) is replaced by 1 when r = 1, p = m+ 1. Let 
= 0, r integral and S n, 


(r — + p—k — 1) 
I(r — k)k!(p — k)! log (n — p + 1)’ 


Um m=1,2,-:-, 


Sin = 1)" 


allother k. Then 


Letting k = p = 0 in (16), we find that the coefficient of [log (n + 1)]” is 
(I(r + 1]. For 1 S< x S p, we can easily prove that >°{=5 Sip = 
+ p — x — I1)[p(r — Therefore Sip = 
and > 20 Sip = 0, p = 1. It follows that x¥’ = [I(r + 1) log (n + 1)]"; 
therefore the hypothesis of the theorem is satisfied. But 


(m — 1)! (r — +m — p) 
T(r + m) log (m + 1) — p + 1)! 
= [(r + m — 1) log (m+ m = 1,2, 


therefore $\sum u_n$ diverges.
TuroreM 8: If u, = Sn Smt+ a, k = 0, 1,---, where 
Pn = 0, mo = 0, me = Mea + Axa + 1; and of 


(17) du, is summable (C, r)", 7 = 2, 


10 Toeplitz, loc. cit., p. 115. 




then $\sum u_n$ is summable $|C, 0|$, provided there is a positive constant $a$ for which at
least one of the following conditions is satisfied:


(18) x, < a, k=0,1,-+-, 
(19) < a, k=0,1,-- 


Proor: Using (17) and the corollary of Theorem 3, we find that x‘ and 
+0asn—> Therefore n(n — 1) | Un — Unt | as n— &, and we 
can find a positive integer N such that the following conditions are ‘ntialied 


when n > N: 
Pn S Pn + Pra = | Un — Una| < [n(n — 
Pn — S | Pn — = | Un — Una| < [n(n — Mm. 


Assuming (18) satisfied, and replacing by P,,, = 0,1,-:-, 
we have, when nq, > N 


co aak —m + 1 co afk a + 1 
P. n < < . 
Therefore > 70 P»,, converges, and since >, u» converges, > 
verges also. This completes the proof when (18) is satisfied; a similar proof 


holds for (19). 
3. Multiplication Theorems for the Nörlund Mean


Let $\sum w_n$ be the Cauchy product of the series $\sum u_n$ and $\sum v_n$. Consider
the Nörlund means, $(N, a)$ and $(N, b)$, defined by the sequences $\{a_n\}$ and $\{b_n\}$.
Let $\{c_n\}$ be the sequence defined by the equations $b_n = \sum_{k=0}^{n} a_{n-k} c_k$; we assume
throughout this section that {a,} and {b,} are fixed sequences for which 
Dic = C, = 0. When a series >) vp is assigned, let b;, and c;,, be defined, 
for n = 0,1, , by by = and Cy, = 5 and let and 
c1=0. We shall make use of the following conditions: 


(20) $\sum v_n$ summable $(N, b)$ to $V$;
(21) $\sum v_n$ summable $|N, b|$;


(22) lim Bz = 0, k=0,1,°°°3 
(24) <M, n, k = 0,1, 
(25) [c, Ba’ — < M, k = 0,1, 
(26) lim By! = 0, k=0,1, 




n n—k 
(27) | Ba" | < M, n= 0, 1, 
n—k 
(28) Bat 2) Anpaby| < M, n,k = 0,1, +++; 
n—k 
9) [bp Bs — <M, k=0,1,--- 


THEOREM 9: In order that $\sum v_n$ may be such that $\sum w_n$ is summable $(N, b)$
to $UV$ whenever $\sum u_n$ is summable $(N, a)$ to $U$, it is necessary and sufficient that
$\sum v_n$ satisfy (20), (22) and (23).

Let (N, a) be regular, and assume that >>%_5 n | tn | < «©, where 2, is defined, 
forn = 1,2, --- , by = 0, and where = 1. This restricted form 
oi Theorem 9 has been obtained by C. N. Moore” as a special case of a more 
general theorem. 
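
The multiplication theorems of this section all concern the Cauchy product $\sum w_n$, with $w_n = \sum_{k=0}^{n} u_k v_{n-k}$. The following Python sketch, with hypothetical helper names, forms the first terms of the product and their partial sums $W_n$, so that a summability method can then be applied to $\{W_n\}$.

```python
def cauchy_product(u, v):
    """Terms w_n = sum_{k=0}^{n} u_k * v_{n-k} for n < min(len(u), len(v))."""
    n_terms = min(len(u), len(v))
    return [sum(u[k] * v[n - k] for k in range(n + 1)) for n in range(n_terms)]


def partial_sums(terms):
    """Partial sums W_n = w_0 + ... + w_n of a finite list of terms."""
    sums, total = [], 0
    for t in terms:
        total = total + t
        sums.append(total)
    return sums


if __name__ == "__main__":
    u = [0.5 ** n for n in range(40)]    # geometric series with sum 2
    w = cauchy_product(u, u)             # Cauchy square, with sum 4
    print(partial_sums(w)[-1])
    # With v_0 = 1 and v_n = 0 for n >= 1 the product reproduces u itself.
    print(cauchy_product([1, -1, 1], [1, 0, 0]))
```

The second call uses the series $v_0 = 1$, $v_n = 0$ $(n \ge 1)$, for which $\sum w_n = \sum u_n$; the same choice appears in the example comparing Theorems 10 and 13 later in this section.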

THEOREM 10: In order that $\sum v_n$ may be such that $\sum w_n$ is summable $(N, b)$
to $UV$ whenever $\sum u_n$ is summable $(N, a)$ to $U$ and in addition summable $|N, a|$,
it is necessary and sufficient that $\sum v_n$ satisfy (20), (22) and (24).

THEOREM 11: In order that $\sum v_n$ may be such that $\sum w_n$ is summable $|N, b|$
whenever $\sum u_n$ is summable $|N, a|$, it is necessary and sufficient that $\sum v_n$
satisfy (25).
Proors of Theorems 9, 10, and 11: The sequence {U’,}, into which {U,} 
is transformed by (N, a) is obtained from (8) when p, is replaced by 
a,,n = 0,1,---. The sequence {W,} is transformed by (N, b) into {W%}, 
where 


n k 
(30) = Bo Dd Up - 
k=0 p=0 


The matrix || cnx || , defined by = Bz’ transforms {U’,} into {W%,}. 
To complete the proof of Theorem 9, we require that $\| c_{nk} \|$ satisfy (2), (3) and
(5); to complete Theorem 10, that $\| c_{nk} \|$ satisfy (2), (3) and (6); to complete
Theorem 11, that $\| c_{nk} \|$ satisfy (7).

THEOREM 12: In order that $\sum v_n$ may be such that $\sum w_n$ is summable $(N, b)$
to $UV$ whenever $\sum u_n$ is summable $(N, a)^{-1}$ to $U$, it is necessary and sufficient that
$\sum v_n$ satisfy (20), (26) and (27).

THEOREM 13: In order that $\sum v_n$ may be such that $\sum w_n$ is summable $(N, b)$
to $UV$ whenever $\sum u_n$ is summable $(N, a)^{-1}$ to $U$ and in addition summable
$|N, a|^{-1}$, it is necessary and sufficient that $\sum v_n$ satisfy (20), (26) and (28).

THEOREM 14: In order that $\sum v_n$ may be such that $\sum w_n$ is summable $|N, b|$
whenever $\sum u_n$ is summable $|N, a|^{-1}$, it is necessary and sufficient that $\sum v_n$
satisfy (29).

Proofs of Theorems 12, 13 and 14: The sequence $\{U'_n\}$, into which $\{U_n\}$
is transformed by $(N, a)^{-1}$, is obtained from (9) when $p_n$ is replaced by $a_n$,


11 Moore, Summable Series and Convergence Factors, American Mathematical Society,
Colloquium Publication, vol. 22, p. 44, Theorem II.




n = 0,1, --- ; {W%,} is defined by (30). The matrix || c, || , defined by c,, = 
_pl, transforms {U’,} into {W’,}. The remainder of the 
proof is the same as that of Theorems 9, 10 and 11. 

It is easy to prove that necessary and sufficient conditions that (22) follow
from (20) when $(N, a) = (C, 0)$ are that $b_n B_{n+k}^{-1} \to 0$ and $|B_n B_{n+k}^{-1}| < M$,
$n, k = 0, 1, \dots$; a necessary and sufficient condition that (24) follow from
(20) when $(N, a) = (C, 0)$ is $|B_n B_{n+k}^{-1}| < M$, $n, k = 0, 1, \dots$; a necessary and
sufficient condition that (25) follow from (21) when $(N, a) = (C, 0)$ is that
$(N, b)$ be absolutely regular. Using these facts we can prove the following
theorems, which are special cases of Theorems 10 and 11.

THEOREM 10': Let $(N, b)$ be regular. In order that $\sum v_n$ may be such that
$\sum w_n$ is summable $(N, b)$ to $UV$ whenever $\sum u_n$ is summable $(C, 0)$ to $U$ and in
addition summable $|C, 0|$, it is necessary and sufficient that $\sum v_n$ satisfy (20).

THEOREM 11': Let $(N, b)$ be absolutely regular. In order that $\sum v_n$ may be
such that $\sum w_n$ is summable $|N, b|$ whenever $\sum u_n$ is summable $|C, 0|$ it is
necessary and sufficient that $\sum v_n$ satisfy (21).

The triangular matrix || cn» ||, defined by ¢np = pie — 
Pp = 0, 1, n; = O, transforms {B;' Bn_pvp} into 
{Bure Az It can be proved that necessary and sufficient 
conditions that $\| c_{np} \|$ transform every convergent sequence into a bounded
sequence are that $(N, a)$ be absolutely regular, and that $|B_n B_{n+k}^{-1}| < M$,
$n, k = 0, 1, \dots$; therefore, under these conditions, (28) follows from (20).
Since $\| c_{np} \|$ is absolutely regular if $(N, a)$ and $(N, b)$ are absolutely regular,
(29) is a consequence of (21) if these conditions on $(N, a)$ and $(N, b)$ are satisfied.
Similarly we can prove that (26) is a consequence of (20) if $(N, a)$ and $(N, b)$
are regular and $(N, a)$ absolutely regular. From these facts we obtain the
following special cases of Theorems 13 and 14.

THEOREM 13': Let $(N, a)$ and $(N, b)$ be regular, and $(N, a)$ absolutely regular.
In order that $\sum v_n$ may be such that $\sum w_n$ is summable $(N, b)$ to $UV$ whenever
$\sum u_n$ is summable $(N, a)^{-1}$ to $U$ and in addition summable $|N, a|^{-1}$, it is
necessary and sufficient that $\sum v_n$ satisfy (20).

THEOREM 14': Let $(N, a)$ and $(N, b)$ be absolutely regular. In order that
$\sum v_n$ may be such that $\sum w_n$ is summable $|N, b|$ whenever $\sum u_n$ is summable
$|N, a|^{-1}$ it is necessary and sufficient that $\sum v_n$ satisfy (21).

Comparing Theorems 10' and 13', we find that in order that $\sum w_n$ may be
summable $(N, b)$ to $UV$ when $(N, a)$ and $(N, b)$ are regular and $(N, a)$ absolutely
regular, it is necessary and sufficient that $\sum v_n$ satisfy (20), not only if $\sum u_n$
is summable $|C, 0|$, but also if $\sum u_n$ satisfies the more stringent condition of
summability $|N, a|^{-1}$. A similar comparison of Theorems 11' and 14' shows
that in order that $\sum w_n$ may be summable $|N, b|$, when $(N, a)$ and $(N, b)$
are absolutely regular, it is necessary and sufficient that $\sum v_n$ satisfy (21),
whether $\sum u_n$ be summable $|N, a|^{-1}$, or merely summable $|C, 0|$.

However, if we do not impose additional conditions upon $(N, a)$ and $(N, b)$,
we can construct examples to show that the conditions imposed upon $\sum v_n$


in The 
14 less 
= 1 
(28) a 
satisfie 
D un i 

Con 
sufficie 
9 when 
conditi 
9 when 


If we 
to zero 
and ab 
quence 
(26), ( 
+ 
and (3: 


(31) > 
(32) > 


| | 
> 
P= 
| 
| (35) 2 
(0) 
The fc 
THEO! 
toUVw 
Dd sa 
THEOR 


THE INVERSE NORLUND MEAN 409 


in Theorem. 13 are less stringent than those of Theorem 10, and those of Theorem 
14 less stringent than those of Theorem 11. For if we let (N, b) = (C, —1/2), 
v = 1,0 = 0,n = 1, 2, --- , we find that when (N, a)’ = (C, 1)”, (20), (26), 
(28) and (29) are satisfied, but when (N, a) = (C, 0), (24) and (25) are not 
satisfied. Therefore there is some >> u, summable | C, 0 |, such that >> w, = 
> un is not summable (C, — 1/2) for | C, —1/2 |], but there is no > un summable 
|C,1|~* such that >> w, is not summable (C, —1/2) [or | C, —1/2 |]. 

Conditions of regularity and absolute regularity of (N, a) and (N, b) are not
sufficient to make the conditions of Theorem 12 equivalent to those of Theorem
9 when (N, a) = (C, 0). If (N, b) = (C, 0), $\sum_{n=0}^{\infty} (-1)^n (n + 1)^{-1}$ satisfies the
conditions of Theorem 12 when (N, a) = (C, 1), but does not satisfy Theorem
9 when (N, a) = (C, 0).


4. Multiplication Theorems for the Cesàro Mean


If we let (N, a) = (C, r) and (N, b) = (C, s), where r and s are either equal
to zero or have a real component greater than zero, (C, r) and (C, s) are regular
and absolutely regular. It is easily proved that in this case (22) is a consequence
of (20). Therefore we shall not need conditions corresponding to (22),
(26), (28) and (29) for the Cesàro mean. In the following conditions,
$\Gamma(x + y - 1)$ is to be replaced by 1 when x = 0, y = 1; $s \ge r$ in (33), (34)
and (35); and S(x, y) = + (a + = 0, 
S(z, -—1) = 0. 


(31) $\sum v_n$ summable (C, s) to V;
(32) $\sum v_n$ summable | C, s |;


+ p+ 1)S(s — — p) _ 
(33) ETE <M, n=0,1,-:; 

+ p + 1)S(s — n — p) =0,1,---; 
(34) fe tat <M, n,k=0,1, 

E& —1r,p)_ (n—1)!8(s—1r,p— +n—-—pt1) 
(35) | p=o +n + 1) + n) (n — p)! 


<M, 


(36) 


k=0 


n—k nirT(r +n — k)(n p)!S(s, 
Bite <M, n= 0,1, 


The following theorems are special cases of Theorems 9 through 14.

THEOREM 15: In order that $\sum v_n$ may be such that $\sum w_n$ is summable (C, s)
to UV whenever $\sum u_n$ is summable (C, r) to U, it is necessary and sufficient that
$\sum v_n$ satisfy (31) and (33).*

THEOREM 16: In order that $\sum v_n$ may be such that $\sum w_n$ is summable (C, s)
to UV whenever $\sum u_n$ is summable (C, r) to U and in addition summable | C, r |,
it is necessary and sufficient that $\sum v_n$ satisfy (31) and (34).

* Moore, loc. cit., p. 46, Theorem VI.

THEOREM 17: In order that $\sum v_n$ may be such that $\sum w_n$ is summable | C, s |
whenever $\sum u_n$ is summable | C, r |, it is necessary and sufficient that $\sum v_n$
satisfy (35).

THEOREM 18: In order that $\sum v_n$ may be such that $\sum w_n$ is summable (C, s)
to UV whenever $\sum u_n$ is summable $(C, r)^{-1}$ to U, it is necessary and sufficient that
$\sum v_n$ satisfy (31) and (36).

THEOREM 19: In order that $\sum v_n$ may be such that $\sum w_n$ is summable (C, s)
to UV whenever $\sum u_n$ is summable $(C, r)^{-1}$ to U and in addition summable $|C, r|^{-1}$,
it is necessary and sufficient that $\sum v_n$ satisfy (31).

THEOREM 20: In order that $\sum v_n$ may be such that $\sum w_n$ is summable | C, s |
whenever $\sum u_n$ is summable $|C, r|^{-1}$, it is necessary and sufficient that $\sum v_n$
satisfy (32).


THE GEORGE WASHINGTON UNIVERSITY.



ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


CONVERGENCE OF CERTAIN GAP SERIES 
By M. Kac 


(Received March 2, 1942) 
1. The present paper investigates convergence properties of series

$$\sum_{k=1}^{\infty} c_k\,\varphi(n_k x),$$

where $\{n_k\}$ is a gap sequence of integers ($n_{k+1}/n_k > q > 1$) and $\varphi(x)$ a periodic
function satisfying Hölder's condition. Several results in this direction were
obtained by the present writer$^1$ and Izumi and Kawata.$^2$ In both cases not only
bigger gaps were considered but also the sequence $\{n_k\}$ was assumed to be of
the form $2^{m_k}$, where $\{m_k\}$ was a further specified sequence of integers. In this
paper nothing is assumed about the arithmetical structure of the sequence $\{n_k\}$
and the gap condition is the one to be hoped for by analogy with a corresponding
theorem of Kolmogoroff$^3$ concerning trigonometrical series. In concluding the
proof we follow a very ingenious idea of Paley and Zygmund.$^4$


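2. THEOREM. Let $\varphi(x)$ be a complex-valued function, defined over $-\infty < x < \infty$,
satisfying the conditions

(1) $\varphi(x + \lambda) = \varphi(x)$,

(2) $|\varphi(x') - \varphi(x'')| \le M\,|x' - x''|^{\alpha}$, $\quad -\infty < x', x'' < \infty$,

(3) $\int_0^{\lambda} \varphi(x)\,dx = 0$,

where $\lambda$, M, and $\alpha$ are constants for which $\lambda > 0$ and $0 < \alpha \le 1$. Let $\{n_k\}$ be a
sequence of integers satisfying the gap condition $n_{k+1}/n_k > q > 1$, k = 1, 2, ...,
for some constant q. Then if $\sum_{k=1}^{\infty} |c_k|^2 < \infty$, the series

(4) $\sum_{k=1}^{\infty} c_k\,\varphi(n_k x)$

converges in the mean with exponent 2 over each finite interval to a function f(x)
having period $\lambda$ and, moreover, the series (4) converges almost everywhere to this
function f(x).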


1 M. Kac, Sur les fonctions indépendantes V, Studia Mathematica, 7 (1938), pp. 96-100.

2 S. Izumi and T. Kawata, On certain series of functions, Tohoku Math. Journal, 46
(1939), pp. 91-105.

3 A. Kolmogoroff, Une contribution à l'étude de la convergence des séries de Fourier,
Fundamenta Math. 5 (1924), pp. 96-97.

4 See for instance S. Kaczmarz and H. Steinhaus, Theorie der Orthogonalreihen,
Monografje Matematyczne, pp. 126-127 and 137-138, where the method is explained in two
examples.


A linear change of independent variable reduces the general case to that in
which $\lambda = 2\pi$; we hereafter take $\lambda$ to be $2\pi$. If $\varphi(x) = \varphi_1(x) + i\varphi_2(x)$ where
$\varphi_1(x)$ and $\varphi_2(x)$ are real, then $\varphi_1(x)$ and $\varphi_2(x)$ each satisfy the conditions imposed
upon $\varphi(x)$; it is therefore easy to see that the theorem follows when we prove it
for the case in which $\varphi(x)$ is real. Likewise, it is sufficient to establish the result
for the case in which the constants $c_k$ are real. If the conditions on $\varphi(x)$ are
satisfied when $\alpha = \alpha_2$ and if $0 < \alpha_1 < \alpha_2$, then the conditions are satisfied when
$\alpha = \alpha_1$. Hence we may (and shall) assume that $0 < \alpha < 1$.

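LEMMA 1. There is a constant A independent of j and k such that

$$\left|\int_0^{2\pi} \varphi(n_j x)\,\varphi(n_k x)\,dx\right| \le A\,q^{-\alpha|j-k|}.$$

Let $\varphi(x) \sim \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$, let $s_n(x)$ denote the n-th partial sum
of this series, and let $\sigma_n(x)$ denote the n-th Fejér trigonometric polynomial of
$\varphi(x)$. By a theorem of S. Bernstein,$^5$ there is a constant D such that

$$|\varphi(x) - \sigma_n(x)| \le D n^{-\alpha};$$

hence, since $s_n(x)$ is the trigonometric polynomial of order n of best approximation in the mean,

$$\sum_{l=n+1}^{\infty} (a_l^2 + b_l^2) \le \frac{1}{\pi}\int_0^{2\pi} |\varphi(x) - \sigma_n(x)|^2\,dx \le 2D^2 n^{-2\alpha}.$$

Use of Parseval's relation gives

$$\frac{1}{\pi}\int_0^{2\pi} \varphi(n_j x)\,\varphi(n_k x)\,dx = \sum_{r n_j = s n_k} (a_r a_s + b_r b_s).$$

Assuming that j < k, we see that if positive integers r and s satisfy the equality
$r n_j = s n_k$, then $s \ge 1$ and $r \ge n_k/n_j$; hence use of the inequality

$$|a_r a_s + b_r b_s| \le (a_r^2 + b_r^2)^{1/2}(a_s^2 + b_s^2)^{1/2}$$

and of the Schwarz inequality gives

$$\left|\frac{1}{\pi}\int_0^{2\pi} \varphi(n_j x)\,\varphi(n_k x)\,dx\right| \le \Big(\sum_{r \ge n_k/n_j} (a_r^2 + b_r^2)\Big)^{1/2}\Big(\sum_{s=1}^{\infty} (a_s^2 + b_s^2)\Big)^{1/2} \le B\,(n_j/n_k)^{\alpha} \le B\,q^{-\alpha(k-j)}.$$

This establishes Lemma 1.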
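The decay asserted in Lemma 1 can be observed numerically. The following is only a sketch under stated assumptions: the triangle wave `phi`, the sequence $n_k = 3^k$ and the midpoint-rule helper `correlation` are illustrative choices that do not appear in the paper. For this particular $\varphi$ a direct Fourier computation gives the exact ratio $9^{-d}$ when $n_k/n_j = 3^d$.

```python
import math

def phi(x):
    """Hypothetical test function: triangle wave of period 2*pi with zero mean.
    It is Lipschitz, so it satisfies conditions (1)-(3) with lambda = 2*pi."""
    t = x % (2.0 * math.pi)
    return abs(t - math.pi) - math.pi / 2.0

def correlation(m, n, steps=200_000):
    """Midpoint-rule approximation of the integral of phi(m x) * phi(n x)
    over the interval [0, 2*pi]."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += phi(m * x) * phi(n * x)
    return total * h

# Gap sequence n_k = 3**k (an assumption made for the illustration).
n = [3 ** k for k in range(1, 5)]
base = correlation(n[0], n[0])
for d in range(1, 4):
    # Ratio of the mixed integral to the diagonal one; it shrinks geometrically in d.
    print(d, correlation(n[0], n[d]) / base)
```

The printed ratios fall off geometrically in the index difference, which is the behaviour the bound $A q^{-\alpha|j-k|}$ describes.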
LEMMA 2. There is a constant C, independent of a, b, and k, such that

$$\left|\int_a^b \varphi(n_k x)\,dx\right| \le C\,n_k^{-1}, \qquad 0 \le a < b \le 2\pi,\ k = 1, 2, \ldots.$$


5 S. Bernstein, Sur l'ordre de la meilleure approximation des fonctions continues par des
polynomes de degré donné, Mémoires de l'Académie royale de Belgique, IV, pp. 1-104; in
particular p. 88. See also A. Zygmund, Trigonometrical series, Monografje Matematyczne,
p. 62.


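When $0 \le x < 2\pi$, let $\psi(x)$ be the characteristic function of the interval
$a < x < b$; and let $\psi(x)$ be extended over the infinite interval so that $\psi(x + 2\pi) = \psi(x)$.
Let

$$\psi(x) \sim \tfrac{1}{2} d_0 + \sum_{n=1}^{\infty} (d_n \cos nx + e_n \sin nx).$$

Then, by Parseval's relation,

$$\frac{1}{\pi}\int_0^{2\pi} \psi(x)\,\varphi(n_k x)\,dx = \sum_{l=1}^{\infty} (a_l d_{l n_k} + b_l e_{l n_k}),$$

and therefore

$$\left|\int_0^{2\pi} \psi(x)\,\varphi(n_k x)\,dx\right| = \pi\left|\sum_{l=1}^{\infty} (a_l d_{l n_k} + b_l e_{l n_k})\right| \le \pi\Big[\sum_{l=1}^{\infty} (a_l^2 + b_l^2)\Big]^{1/2}\Big[\sum_{l=1}^{\infty} (d_{l n_k}^2 + e_{l n_k}^2)\Big]^{1/2}.$$

Since $|d_m|$ and $|e_m|$ do not exceed $2/(\pi m)$, the last factor is at most a constant multiple of $n_k^{-1}$.
The conclusion of Lemma 2 follows.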
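LEMMA 3. The series

$$\sum_{k=1}^{\infty} c_k\,\varphi(n_k x)$$

converges in the mean, with exponent 2, over $0 \le x \le 2\pi$.
If m and n are integers for which $1 \le m < n$, then

$$\int_0^{2\pi} \Big|\sum_{k=m}^{n} c_k \varphi(n_k x)\Big|^2 dx = \sum_{j=m}^{n}\sum_{k=m}^{n} c_j c_k \int_0^{2\pi} \varphi(n_j x)\varphi(n_k x)\,dx \le A \sum_{j=m}^{n}\sum_{k=m}^{n} |c_j|\,|c_k|\,q^{-\alpha|j-k|} \le \frac{2A}{1 - q^{-\alpha}} \sum_{k=m}^{n} c_k^2,$$

and Lemma 3 follows.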

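Let f(x) be the function to which $\sum c_k \varphi(n_k x)$ converges in the mean; this function
has period $2\pi$ and belongs to class $L_2$ over the interval $0 \le x \le 2\pi$. Let
$t_0$ be such a point that

$$f(t_0) = \lim_{h \to 0} \frac{1}{h}\int_{t_0}^{t_0+h} f(x)\,dx;$$

almost all points $t_0$ have this property. We complete the proof of the theorem
by proving that the series (4) converges to $f(t_0)$ when $x = t_0$.
Since $\sum c_k \varphi(n_k x)$ converges in mean to f(x),

$$\int_{t_0}^{t_0+h} f(x)\,dx = \sum_{k=1}^{\infty} c_k \int_{t_0}^{t_0+h} \varphi(n_k x)\,dx;$$

using this fact and Lemma 2, we obtain

$$\left|\int_{t_0}^{t_0+h} f(x)\,dx - \sum_{k=1}^{r} c_k \int_{t_0}^{t_0+h} \varphi(n_k x)\,dx\right| \le C \sum_{k=r+1}^{\infty} |c_k|\,n_k^{-1}.$$

Use of (2) gives

$$\left|\sum_{k=1}^{r} c_k \int_{t_0}^{t_0+h} [\varphi(n_k x) - \varphi(n_k t_0)]\,dx\right| \le M\,|h|^{1+\alpha} \sum_{k=1}^{r} |c_k|\,n_k^{\alpha}.$$

Combining these inequalities gives

$$\left|\int_{t_0}^{t_0+h} f(x)\,dx - h\sum_{k=1}^{r} c_k \varphi(n_k t_0)\right| \le C \sum_{k=r+1}^{\infty} |c_k|\,n_k^{-1} + M\,|h|^{1+\alpha} \sum_{k=1}^{r} |c_k|\,n_k^{\alpha}.$$

Let $h_r = n_r^{-1}$. Then

$$|h_r|^{\alpha} \sum_{k=1}^{r} |c_k|\,n_k^{\alpha} = \sum_{k=1}^{r} |c_k| \Big(\frac{n_k}{n_r}\Big)^{\alpha} \le \sum_{k=1}^{r} |c_k|\,\frac{1}{q^{\alpha(r-k)}},$$

and since $c_k \to 0$ as $k \to \infty$, the elementary theory of matrix transformations
implies that the last member converges to 0 as $r \to \infty$. The same reasoning applies to
$h_r^{-1}\sum_{k=r+1}^{\infty} |c_k|\,n_k^{-1} \le \sum_{k=r+1}^{\infty} |c_k|\,q^{-(k-r)}$. It now follows easily
that

$$\lim_{r \to \infty}\left[\frac{1}{h_r}\int_{t_0}^{t_0+h_r} f(x)\,dx - \sum_{k=1}^{r} c_k \varphi(n_k t_0)\right] = 0$$

and hence that $\sum c_k \varphi(n_k t_0)$ converges to $f(t_0)$. This completes the proof of the
theorem.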


3. By analogy with some former results (loc. cit. 1 and 2) one might expect
that under the conditions of our theorem the divergence of

$$\sum_{k=1}^{\infty} c_k^2 \qquad (c_n \to 0 \text{ as } n \to \infty)$$

would imply the divergence almost everywhere of the series

(a) $\sum_{k=1}^{\infty} c_k\,\varphi(n_k x)$.

This can be easily disproved by the following simple example.
Let $\psi(x)$ satisfy the conditions (1), (2) and (3) of our theorem and put

$$\varphi(x) = \psi(x) - \psi(2x).$$

Put $n_k = 2^k$ and choose $\{c_k\}$ in such a way that

(b) $\sum_{k=1}^{\infty} c_k^2 = \infty$

and

(c) $\sum_{k=1}^{\infty} |c_k - c_{k+1}| < \infty$

(for instance $c_k = 1/\sqrt{k}$). We have

$$\sum_{k=1}^{n} c_k\,\varphi(2^k x) = c_1\psi(2x) - \sum_{k=1}^{n-1} (c_k - c_{k+1})\,\psi(2^{k+1}x) - c_n\,\psi(2^{n+1}x).$$

The right side tends to a limit for every x, hence so does the series (a). This
combined with (b) disproves the conjecture.
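The summation by parts behind this example can be checked numerically. This is a minimal sketch, assuming an illustrative triangle wave for $\psi$ (the paper leaves $\psi$ unspecified) and the choice $c_k = 1/\sqrt{k}$; the helper names `direct_sum` and `telescoped_sum` are hypothetical. It writes the rearranged sum in the equivalent form $c_1\psi(2x) + \sum_{k=2}^{n}(c_k - c_{k-1})\psi(2^k x) - c_n\psi(2^{n+1}x)$.

```python
import math

def psi(x):
    """Hypothetical choice of psi: zero-mean, 2*pi-periodic Lipschitz triangle wave."""
    t = x % (2.0 * math.pi)
    return abs(t - math.pi) - math.pi / 2.0

def phi(x):
    """phi(x) = psi(x) - psi(2x), as in the example."""
    return psi(x) - psi(2.0 * x)

def c(k):
    """c_k = 1/sqrt(k): the squares sum to infinity, the differences are summable."""
    return 1.0 / math.sqrt(k)

def direct_sum(x, n):
    """Partial sum of the series (a) with n_k = 2**k."""
    return sum(c(k) * phi(2 ** k * x) for k in range(1, n + 1))

def telescoped_sum(x, n):
    """The same partial sum after summation by parts."""
    head = c(1) * psi(2.0 * x)
    middle = sum((c(k) - c(k - 1)) * psi(2 ** k * x) for k in range(2, n + 1))
    tail = c(n) * psi(2 ** (n + 1) * x)
    return head + middle - tail

x = 0.7
for n in (5, 10, 20):
    print(n, direct_sum(x, n), telescoped_sum(x, n))
```

Because $\psi$ is bounded, the middle sum converges absolutely under (c) and the tail term vanishes as $c_n \to 0$, which is why the partial sums settle down even though $\sum c_k^2$ diverges.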


CORNELL UNIVERSITY.


ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


ON THE EXPANSION OF THE PARTITION FUNCTION IN A SERIES 


By HANS RADEMACHER
(Received April 8, 1943)


1. A geometric property of the Farey series, discovered by L. R. Ford (1)
is used in this note for the construction of a new path of integration to replace
the circle carrying the Farey dissection, first introduced by Hardy and Ramanujan
in their classical paper (2). This new path of integration will bring about
an essential simplification in the treatment of the partition function and, in
general, in the determination of the coefficients of modular functions of
non-negative dimension. It seems to me that the new path exhibits more clearly
than the Farey arcs do the different contributions of the approximation functions
near the roots of unity. Moreover, only two estimations have to be performed,
and they are direct consequences of the obvious statements (3.2) and (4.1)
concerning the circle over the diameter 0 to 1.

Ford's theorem referred to above can be enunciated as follows:

If in a complex $\tau$-plane we mark the points corresponding to the reduced
fractions h/k and draw about the points

$$\tau_{hk} = \frac{h}{k} + \frac{i}{2k^2}$$

the circles of radii $1/(2k^2)$, which touch the real axis at h/k, then these circles do
not intersect. Two of them are tangent to each other if and only if their fractions
h/k and l/m appear as neighbors in a Farey series of some order.

The proof is clear. Comparing the distance of the centers with the sum of the
radii of two such circles we consider

$$|\tau_{hk} - \tau_{lm}|^2 - \Big(\frac{1}{2k^2} + \frac{1}{2m^2}\Big)^2 = \frac{(hm - lk)^2 - 1}{k^2 m^2} \ge 0.$$

The equality sign, indicating contact of the circles, is attained only for

$$hm - lk = \pm 1,$$

and this would mean that the fractions h/k and l/m are neighbors in some
Farey series, e.g. that of order N = k + m - 1.
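The identity used in this proof lends itself to an exact check with rational arithmetic. The sketch below is illustrative only: the helper names `gap`, `closed_form`, `farey` and `adjacent_in_some_farey` are assumptions introduced here, not notation from the paper.

```python
from fractions import Fraction

def gap(h, k, l, m):
    """|tau_hk - tau_lm|^2 - (r_hk + r_lm)^2 for the Ford circles of h/k and l/m."""
    dx = Fraction(h, k) - Fraction(l, m)
    r1 = Fraction(1, 2 * k * k)
    r2 = Fraction(1, 2 * m * m)
    return dx * dx + (r1 - r2) ** 2 - (r1 + r2) ** 2

def closed_form(h, k, l, m):
    """Right-hand side ((hm - lk)^2 - 1) / (k^2 m^2) of the identity."""
    return Fraction((h * m - l * k) ** 2 - 1, k * k * m * m)

def farey(N):
    """Farey series of order N on [0, 1] as a sorted list of reduced fractions."""
    return sorted({Fraction(h, k) for k in range(1, N + 1) for h in range(k + 1)})

def adjacent_in_some_farey(a, b):
    """True if a < b are neighbors in the Farey series of order max(k, m)."""
    s = farey(max(a.denominator, b.denominator))
    i = s.index(a)
    return i + 1 < len(s) and s[i + 1] == b

series = farey(5)
for i, a in enumerate(series):
    for b in series[i + 1:]:
        args = (a.numerator, a.denominator, b.numerator, b.denominator)
        g = gap(*args)
        assert g == closed_form(*args)             # the algebraic identity
        assert g >= 0                               # circles never overlap
        assert (g == 0) == adjacent_in_some_farey(a, b)  # tangency iff neighbors
print("checked", len(series), "fractions")
```

Exact `Fraction` arithmetic is used so that tangency (a gap of exactly zero) can be tested without floating-point tolerance.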

To each positive integer N we introduce now a path $P_N$ which we shall later
use for the complex integration. Let $C_{hk}$ be the circle of Ford's theorem belonging
to the (reduced) fraction h/k. We draw all circles $C_{hk}$ for $k \le N$,
$0 \le h/k \le 1$, in other words all the circles belonging to the Farey series of order
N. If $h_1/k_1 < h/k < h_2/k_2$ are three adjacent fractions of that series, then
the circle $C_{hk}$ has a point of contact with $C_{h_1k_1}$ as well as with $C_{h_2k_2}$. These
points of contact cut $C_{hk}$ into two arcs, an upper one and a lower one. (The
lower one touches the real axis.) As path $P_N$ we take now the row of upper
arcs $\gamma_{hk}$, each traversed on its circle in the negative sense, on $C_{hk}$ therefore from
the point of contact with $C_{h_1k_1}$ to the point of contact with $C_{h_2k_2}$. The figure
shows $P_N$ for N = 3. Because of the periodicity of the function to be integrated
it does not matter that instead of the whole arc $\gamma_{01}$ we have drawn a part of
$\gamma_{11}$ which is obtained from the omitted part of $\gamma_{01}$ through the translation +1.
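For a later purpose we need the coordinates of the endpoints of the arc $\gamma_{hk}$.
They are, as simple geometric arguments show,

(1.1) $\dfrac{h}{k} + \zeta'_{hk}, \qquad \zeta'_{hk} = -\dfrac{k_1}{k(k^2 + k_1^2)} + \dfrac{i}{k^2 + k_1^2}$

and

(1.2) $\dfrac{h}{k} + \zeta''_{hk}, \qquad \zeta''_{hk} = \dfrac{k_2}{k(k^2 + k_2^2)} + \dfrac{i}{k^2 + k_2^2}.$

[Fig. 1: the path $P_N$ for N = 3.]

Incidentally, the point $h/k + \zeta'_{hk}$ lies on the semicircle over the diameter $(h_1/k_1,
h/k)$; the whole path $P_N$ lies above the row of semicircles connecting adjacent
fractions of the Farey series of order N.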
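The endpoint formulas (1.1) and (1.2) can be verified exactly for a small Farey series. This is a sketch under assumptions: the function names are hypothetical, and the check covers only fractions that have both a left and a right neighbor in the series.

```python
from fractions import Fraction

def farey(N):
    """Farey series of order N on [0, 1], as sorted reduced fractions."""
    return sorted({Fraction(h, k) for k in range(1, N + 1) for h in range(k + 1)})

def left_endpoint(f, f1):
    """h/k + zeta'_{hk} from (1.1); f = h/k and f1 = h1/k1 is its left neighbor."""
    k, k1 = f.denominator, f1.denominator
    d = k * k + k1 * k1
    return f - Fraction(k1, k * d), Fraction(1, d)

def right_endpoint(f, f2):
    """h/k + zeta''_{hk} from (1.2); f2 = h2/k2 is the right neighbor of f."""
    k, k2 = f.denominator, f2.denominator
    d = k * k + k2 * k2
    return f + Fraction(k2, k * d), Fraction(1, d)

def on_ford_circle(point, f):
    """True if the point (x, y) lies on the Ford circle of the fraction f."""
    x, y = point
    r = Fraction(1, 2 * f.denominator ** 2)
    return (x - f) ** 2 + (y - r) ** 2 == r * r

def check(N):
    series = farey(N)
    for f1, f, f2 in zip(series, series[1:], series[2:]):
        p1, p2 = left_endpoint(f, f1), right_endpoint(f, f2)
        # Each endpoint is the contact point of two Ford circles.
        assert on_ford_circle(p1, f) and on_ford_circle(p1, f1)
        assert on_ford_circle(p2, f) and on_ford_circle(p2, f2)
        # The left endpoint lies on the semicircle over the diameter (h1/k1, h/k).
        x, y = p1
        assert (x - f1) * (x - f) + y * y == 0
    return len(series)

print(check(6))
```

The semicircle test uses the fact that a point $(x, y)$ lies on the circle with diameter $(u, v)$ on the real axis exactly when $(x - u)(x - v) + y^2 = 0$.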


1 The imaginary part of $\zeta'_{hk}$ is $1/(k^2 + k_1^2)$, and we have

$$\frac{1}{2N^2} \le \frac{1}{k^2 + k_1^2} < \frac{2}{N^2}.$$

Thus $\Im(\zeta'_{hk})$ is of the order $N^{-2}$. This corresponds to the choice of a circle of
radius $\exp(-2\pi N^{-2})$ as the path of integration associated with a Farey dissection of order
N in my previous treatment of p(n). (3).
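2. In order to come to p(n) we start with Euler's formula

(2.1) $f(x) = 1 + \sum_{n=1}^{\infty} p(n)\,x^n = \prod_{m=1}^{\infty} (1 - x^m)^{-1}, \qquad |x| < 1.$

We have therefore

$$p(n) = \int_{\tau_0}^{\tau_0 + 1} f(e^{2\pi i \tau})\,e^{-2\pi i n \tau}\,d\tau$$

for any path of integration in the upper $\tau$-halfplane connecting $\tau_0$ and $\tau_0 + 1$.
We choose the path $P_N$ described above and obtain

$$p(n) = \sum_{0 \le h < k \le N} \int_{\gamma_{hk}} f(e^{2\pi i \tau})\,e^{-2\pi i n \tau}\,d\tau = \sum_{0 \le h < k \le N} e^{-2\pi i h n/k} \int_{\zeta'_{hk}}^{\zeta''_{hk}} f\big(e^{2\pi i h/k + 2\pi i \zeta}\big)\,e^{-2\pi i n \zeta}\,d\zeta.$$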
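The coefficient integral can be tried out on the simplest admissible path, a horizontal segment in the upper half-plane, before passing to $P_N$. The sketch below is an illustration under assumptions: the height 0.1, the truncation of Euler's product at 400 factors and the function names are choices made here, not part of the paper.

```python
import cmath
import math

def f(x, terms=400):
    """Truncated Euler product prod_{m>=1} (1 - x^m)^(-1), meaningful for |x| < 1."""
    value = 1.0 + 0.0j
    power = x
    for _ in range(terms):
        value /= (1.0 - power)
        power *= x
    return value

def p_by_integral(n, height=0.1, samples=512):
    """Trapezoidal rule for the integral of f(e^{2 pi i tau}) e^{-2 pi i n tau}
    along the segment from i*height to 1 + i*height (tau_0 = i*height)."""
    total = 0.0 + 0.0j
    for j in range(samples):
        tau = complex(j / samples, height)
        x = cmath.exp(2j * math.pi * tau)
        total += f(x) * cmath.exp(-2j * math.pi * n * tau)
    return total / samples

for n in (1, 5, 10):
    print(n, p_by_integral(n).real)
```

Since the integrand has period 1 in $\tau$, the equally spaced rule is extremely accurate here; the values it returns are close to p(1) = 1, p(5) = 7 and p(10) = 42.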


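Here and in the sequel it is always understood that h is prime to k. The path
of $\zeta$ between $\zeta'_{hk}$ and $\zeta''_{hk}$ is described by the substitution

$$\tau = \frac{h}{k} + \zeta,$$

where $\tau$ runs on $\gamma_{hk}$. That means that $\zeta$ runs from $\zeta'_{hk}$ to $\zeta''_{hk}$ on the upper arc
of a circle of radius $1/(2k^2)$ about the point $i/(2k^2)$. Introducing in each integral
a new variable z by

$$\zeta = \frac{iz}{k^2}$$

we obtain

(2.2) $p(n) = \sum_{0 \le h < k \le N} \dfrac{i}{k^2}\,e^{-2\pi i h n/k} \displaystyle\int_{z'_{hk}}^{z''_{hk}} f\big(e^{2\pi i h/k - 2\pi z/k^2}\big)\,e^{2\pi n z/k^2}\,dz.$

Here z runs in each integral on an arc of the circle K of radius 1/2 about the point
1/2 as center. The ends of the arc are

$$z'_{hk} = -ik^2\zeta'_{hk}, \qquad z''_{hk} = -ik^2\zeta''_{hk},$$

or, according to (1.1) and (1.2)

(2.3) $z'_{hk} = \dfrac{k^2}{k^2 + k_1^2} + i\,\dfrac{k k_1}{k^2 + k_1^2}, \qquad z''_{hk} = \dfrac{k^2}{k^2 + k_2^2} - i\,\dfrac{k k_2}{k^2 + k_2^2}.$

The points $z'_{hk}$ and $z''_{hk}$ divide the circle K into two arcs; that one is meant as
path of integration which does not touch the imaginary axis. We have $\Re(z) > 0$.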




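On the integrands in (2.2) we apply now the transformation formula of the
function f(x), a formula which stems from the theory of the elliptic modular
functions (cf. (2), Lemma 4.32)

$$f\big(e^{2\pi i h/k - 2\pi z/k^2}\big) = \omega_{hk}\Big(\frac{z}{k}\Big)^{1/2} \exp\Big(\frac{\pi}{12 z} - \frac{\pi z}{12 k^2}\Big)\,f\big(e^{2\pi i h'/k - 2\pi/z}\big).$$

Here $\omega_{hk}$ is a well-known root of unity, and h' is a solution of the congruence

$$hh' \equiv -1 \pmod{k};$$

for the square root the principal branch has to be taken. We get therefore
from (2.2)

(2.4) $p(n) = \sum_{0 \le h < k \le N} i\,k^{-5/2}\,\omega_{hk}\,e^{-2\pi i h n/k} \displaystyle\int_{z'_{hk}}^{z''_{hk}} \Psi_k(z)\,f\big(e^{2\pi i h'/k - 2\pi/z}\big)\,e^{2\pi n z/k^2}\,dz$

with the abbreviation

(2.5) $\Psi_k(z) = z^{1/2} \exp\Big(\dfrac{\pi}{12 z} - \dfrac{\pi z}{12 k^2}\Big).$

We rewrite (2.4) as

(2.6) $p(n) = \sum_{0 \le h < k \le N} i\,k^{-5/2}\,\omega_{hk}\,e^{-2\pi i h n/k} \displaystyle\int_{z'_{hk}}^{z''_{hk}} \Psi_k(z)\,e^{2\pi n z/k^2}\,dz$
$\qquad + \sum_{0 \le h < k \le N} i\,k^{-5/2}\,\omega_{hk}\,e^{-2\pi i h n/k} \displaystyle\int_{z'_{hk}}^{z''_{hk}} \Psi_k(z)\,\big\{f\big(e^{2\pi i h'/k - 2\pi/z}\big) - 1\big\}\,e^{2\pi n z/k^2}\,dz$
$\qquad = \sum_{0 \le h < k \le N} i\,k^{-5/2}\,\omega_{hk}\,e^{-2\pi i h n/k}\,I_{hk} + \sum_{0 \le h < k \le N} i\,k^{-5/2}\,\omega_{hk}\,e^{-2\pi i h n/k}\,I^*_{hk},$

where $I_{hk}$ and $I^*_{hk}$ are respectively abbreviations for the integrals.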


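3. We estimate first

(3.1) $I^*_{hk} = \displaystyle\int_{z'_{hk}}^{z''_{hk}} \Psi_k(z)\,\big\{f\big(e^{2\pi i h'/k - 2\pi/z}\big) - 1\big\}\,e^{2\pi n z/k^2}\,dz.$

The path of integration, which is an arc of the circle K, can here be replaced by
the chord $s_{hk}$ from $z'_{hk}$ to $z''_{hk}$. We have on and in the circle K

(3.2) $0 < \Re(z) \le 1, \qquad \Re\Big(\dfrac{1}{z}\Big) \ge 1.$

This yields the estimate

(3.3) $\Big|\Psi_k(z)\,\big\{f\big(e^{2\pi i h'/k - 2\pi/z}\big) - 1\big\}\,e^{2\pi n z/k^2}\Big| \le |z|^{1/2}\,e^{2\pi n} \sum_{m=1}^{\infty} p(m)\,e^{-2\pi(m - 1/24)}.$

On the chord $s_{hk}$ we have |z| less than or equal to the greater of the numbers
$|z'_{hk}|$ and $|z''_{hk}|$. From (2.3) we derive readily

$$|z'_{hk}| = \frac{k}{(k^2 + k_1^2)^{1/2}}, \qquad |z''_{hk}| = \frac{k}{(k^2 + k_2^2)^{1/2}}.$$

Now

$$(k^2 + k_1^2)^{1/2} \ge 2^{-1/2}(k + k_1) \ge 2^{-1/2}(N + 1)$$

since k and $k_1$ are the denominators of neighbor fractions in the Farey series of
order N. Therefore we have on $s_{hk}$

(3.4) $|z| \le 2^{1/2}\,k\,(N + 1)^{-1}.$

The length of the chord $s_{hk}$ is, according to (2.3),

(3.5) $|z''_{hk} - z'_{hk}| = \dfrac{k(k_1 + k_2)}{(k^2 + k_1^2)^{1/2}(k^2 + k_2^2)^{1/2}} \le \dfrac{2k(k_1 + k_2)}{(k + k_1)(k + k_2)} < \dfrac{4k}{N + 1}.$

From (3.1), (3.3), (3.4), (3.5) we obtain now

$$|I^*_{hk}| < C\,k^{3/2}\,N^{-3/2},$$

where the constant C contains n, which however we keep fixed. This estimate
leads to

$$\Big|\sum_{0 \le h < k \le N} i\,k^{-5/2}\,\omega_{hk}\,e^{-2\pi i h n/k}\,I^*_{hk}\Big| < C \sum_{k=1}^{N} \sum_{h} k^{-1} N^{-3/2} \le C\,N^{-1/2}.$$

We can thus replace (2.6) by

(3.6) $p(n) = \sum_{0 \le h < k \le N} i\,k^{-5/2}\,\omega_{hk}\,e^{-2\pi i h n/k}\,I_{hk} + O(N^{-1/2}), \qquad I_{hk} = \displaystyle\int_{z'_{hk}}^{z''_{hk}} \Psi_k(z)\,e^{2\pi n z/k^2}\,dz.$


4. In $I_{hk}$ we introduce now the whole circle K, from 0 around in the negative
sense to 0, as path of integration:

$$I_{hk} = \int_{K(-)} \Psi_k(z)\,e^{2\pi n z/k^2}\,dz - \int_0^{z'_{hk}} \Psi_k(z)\,e^{2\pi n z/k^2}\,dz - \int_{z''_{hk}}^{0} \Psi_k(z)\,e^{2\pi n z/k^2}\,dz.$$

We estimate the last two integrals. Since they are of the same type we need
to consider only the first. On the circle K we have

(4.1) $\Re\Big(\dfrac{1}{z}\Big) = 1, \qquad 0 < \Re(z) \le 1.$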




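The arc from 0 to $z'_{hk}$ is less than

$$\frac{\pi}{2}\,|z'_{hk}| < \pi\,2^{-1/2}\,k\,N^{-1},$$

and (3.4) is also valid on that arc. Remembering the definition (2.5) we obtain
therefore

$$I_{hk} = \int_{K(-)} \Psi_k(z)\,e^{2\pi n z/k^2}\,dz + O(k^{3/2} N^{-3/2}).$$

Insertion of this into (3.6) yields

$$p(n) = \sum_{0 \le h < k \le N} i\,k^{-5/2}\,\omega_{hk}\,e^{-2\pi i h n/k} \int_{K(-)} \Psi_k(z)\,e^{2\pi n z/k^2}\,dz + O\Big(\sum_{0 \le h < k \le N} k^{-1} N^{-3/2}\Big) + O(N^{-1/2})$$
$$= i \sum_{1 \le k \le N} A_k(n)\,k^{-5/2} \int_{K(-)} z^{1/2} \exp\Big\{\frac{\pi}{12 z} + \frac{2\pi z}{k^2}\Big(n - \frac{1}{24}\Big)\Big\}\,dz + O(N^{-1/2})$$

with the abbreviation

(4.2) $A_k(n) = \sum_{\substack{h \bmod k \\ (h,k) = 1}} \omega_{hk}\,e^{-2\pi i h n/k}.$

If we now let N tend to infinity the error term goes to zero, and we obtain an
infinite convergent series

(4.3) $p(n) = i \sum_{k=1}^{\infty} A_k(n)\,k^{-5/2} \displaystyle\int_{K(-)} z^{1/2} \exp\Big\{\frac{\pi}{12 z} + \frac{2\pi z}{k^2}\Big(n - \frac{1}{24}\Big)\Big\}\,dz.$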
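5. In order to carry out the integral we substitute

$$w = \frac{1}{z}.$$

The path of the integral in the w-plane is then the line parallel to the imaginary
axis through the point 1. Therefore we have

$$p(n) = \frac{1}{i} \sum_{k=1}^{\infty} A_k(n)\,k^{-5/2} \int_{1 - i\infty}^{1 + i\infty} w^{-5/2} \exp\Big\{\frac{\pi w}{12} + \frac{2\pi}{k^2 w}\Big(n - \frac{1}{24}\Big)\Big\}\,dw.$$

The integral here is brought into a known form by the substitution

$$w = \frac{12\,t}{\pi},$$

which yields

$$p(n) = 2\pi\Big(\frac{\pi}{12}\Big)^{3/2} \sum_{k=1}^{\infty} A_k(n)\,k^{-5/2}\,\frac{1}{2\pi i} \int_{\pi/12 - i\infty}^{\pi/12 + i\infty} t^{-5/2} \exp\Big\{t + \frac{\pi^2}{6k^2}\Big(n - \frac{1}{24}\Big)\frac{1}{t}\Big\}\,dt.$$

We find$^2$ thus

$$p(n) = 2\pi\,(24n - 1)^{-3/4} \sum_{k=1}^{\infty} \frac{A_k(n)}{k}\,I_{3/2}\Big(\frac{\pi\sqrt{24n - 1}}{6k}\Big),$$

where $I_{3/2}$ is a "Bessel function of imaginary argument."


Finally, Bessel functions of half odd order can be reduced to elementary
functions. In our case we apply the relation

$$I_{3/2}(z) = \sqrt{\frac{2z}{\pi}}\;\frac{d}{dz}\Big(\frac{\sinh z}{z}\Big).$$

Using the abbreviations

$$C = \pi\sqrt{\tfrac{2}{3}}, \qquad \lambda_n = \sqrt{n - \tfrac{1}{24}},$$

we obtain then the result (3)

$$p(n) = \frac{1}{\pi\sqrt{2}} \sum_{k=1}^{\infty} A_k(n)\,k^{1/2}\,\frac{d}{dn}\Big(\frac{\sinh(C\lambda_n/k)}{\lambda_n}\Big),$$

which we had set out to prove.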
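The final series can be evaluated numerically and compared with exact values of p(n). The sketch below rests on an assumption not spelled out in this note: it takes the "well-known root of unity" to be $\omega_{hk} = \exp(\pi i\,s(h,k))$ with the Dedekind sum $s(h,k) = \sum_{\mu=1}^{k-1} \frac{\mu}{k}\big(\frac{h\mu}{k} - \lfloor\frac{h\mu}{k}\rfloor - \frac{1}{2}\big)$, as in the literature on this formula. The truncation at K = 60 terms and all function names are illustrative choices.

```python
import math

def dedekind_sum(h, k):
    """s(h, k) = sum_{mu=1}^{k-1} (mu/k) * (h*mu/k - floor(h*mu/k) - 1/2)."""
    return sum((mu / k) * (((h * mu) % k) / k - 0.5) for mu in range(1, k))

def A(k, n):
    """A_k(n) from (4.2), assuming omega_{hk} = exp(pi*i*s(h, k)).
    The sum is real, so only the cosine part is accumulated."""
    total = 0.0
    for h in range(k):
        if math.gcd(h, k) == 1:
            total += math.cos(math.pi * dedekind_sum(h, k) - 2.0 * math.pi * h * n / k)
    return total

def rademacher(n, K=60):
    """Partial sum (k <= K) of the series for p(n)."""
    C = math.pi * math.sqrt(2.0 / 3.0)
    lam = math.sqrt(n - 1.0 / 24.0)
    total = 0.0
    for k in range(1, K + 1):
        c = C / k
        # d/dn [sinh(c*lam)/lam], using d(lam)/dn = 1/(2*lam).
        deriv = c * math.cosh(c * lam) / (2.0 * lam ** 2) - math.sinh(c * lam) / (2.0 * lam ** 3)
        total += A(k, n) * math.sqrt(k) * deriv
    return total / (math.pi * math.sqrt(2.0))

def partitions_exact(n):
    """p(n) by the elementary coin-change recurrence, for comparison."""
    table = [1] + [0] * n
    for part in range(1, n + 1):
        for t in range(part, n + 1):
            table[t] += table[t - part]
    return table[n]

for n in (1, 10, 100):
    print(n, rademacher(n), partitions_exact(n))
```

With the trivial bound $|A_k(n)| \le k$ the neglected tail is below 0.2 for these n, so rounding the partial sum recovers p(n).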


UNIVERSITY OF PENNSYLVANIA.


REFERENCES


(1) L. R. FORD, Fractions. American Mathematical Monthly, vol. 45 (1938), pp. 586-601.

(2) G. H. HARDY AND S. RAMANUJAN, Asymptotic Formulae in Combinatory Analysis.
Proc. London Math. Soc. (2), vol. 17 (1918), pp. 75-115; also Ramanujan's Collected
Papers (1927), pp. 276-309.

(3) H. RADEMACHER, On the Partition Function p(n). Proc. London Math. Soc. (2), vol.
43 (1937), pp. 241-254.


2 Watson, Bessel Functions, p. 181, formula (1), where, however, the path of integration
is bent into a loop around the negative real axis; compare the remark to formula (8), p. 177.



ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


ON THE LIMITING DISTRIBUTION OF THE MAXIMUM TERM OF A
RANDOM SERIES


By B. GNEDENKO
(Received February 8, 1943)


Introduction
Consider a sequence

$$x_1, x_2, \ldots, x_n, \ldots$$

of mutually independent random variables subject to one and the same
distribution law F(x). Form another sequence of random variables

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

by setting

$$\xi_n = \max(x_1, x_2, \ldots, x_n).$$

It is easily seen that the distribution function of $\xi_n$ is

$$F_n(x) = P\{\xi_n < x\} = F^n(x).$$

The study of the function $F_n(x)$ for large values of n is of considerable interest.
Many papers have been devoted to this question. In particular, M.
Fréchet [1] found in 1927 the laws which can be limits of $F^n(a_n x)$ for
a suitable choice of the positive constants $a_n$.

This class of limit laws consists of the laws of the following types$^1$

$$\Phi_\alpha(x) = \begin{cases} 0 & \text{for } x \le 0 \\ e^{-x^{-\alpha}} & \text{for } x > 0 \end{cases}$$

and

$$\Psi_\alpha(x) = \begin{cases} e^{-(-x)^{\alpha}} & \text{for } x \le 0 \\ 1 & \text{for } x > 0 \end{cases}$$

where $\alpha$ denotes a positive constant.
In 1928 R. A. Fisher and L. H. C. Tippett [2] established that the limit laws for
$F^n(a_n x + b_n)$, where $a_n > 0$ and $b_n$ are suitably chosen real constants,
reduce to the laws of types $\Phi_\alpha(x)$, $\Psi_\alpha(x)$ and the law

$$\Lambda(x) = e^{-e^{-x}}.$$

R. Mises [3], who began a systematic study of the limit laws for the
distribution of the maximum term around 1936, found several sufficient
conditions for the convergence of the laws $F^n(a_n x + b_n)$ to each of the types just
cited, for a certain fixed choice of constants $a_n > 0$ and $b_n$; at the end of
the last section of this paper we shall have occasion to formulate the sufficient
condition for convergence to the law $\Lambda(x)$ found by R. Mises.


1 Following A. Khintchine and P. Lévy we shall call the type of the law $\Phi(x)$ the set
of laws $\Phi(ax + b)$ for all possible choices of the real constants a > 0 and b.

However, the papers cited do not give a complete solution of the fundamental
problems on the limiting distributions of the maximum term of a random
series. First of all, it remains to determine the domain of attraction of
each proper$^2$ limit law $\Phi(x)$, that is, the set of all
distribution functions F(x) such that, for a suitable choice of the constants $a_n > 0$
and $b_n$, one has

$$\lim_{n \to \infty} F^n(a_n x + b_n) = \Phi(x).$$

It is interesting to note that not only the formulation of the problems
on the distributions of maximum terms but, as we shall see, the
results obtained present a great analogy with the corresponding problems and results
in the theory of stable laws for sums of independent random
variables (see, for example, Chap. V of [4] and [5]).

It will be proved in this paper that the class of limit laws for maxima
consists only of the types indicated. We also give necessary and
sufficient conditions for the domains of attraction of each of the possible limit
types. However, while the conditions found for the laws $\Phi_\alpha(x)$ and $\Psi_\alpha(x)$
may be regarded as definitive, this is not so for the conditions
concerning the law $\Lambda(x)$; these conditions have not yet received, in our opinion,
a definitive form convenient for applications. In §1 necessary and sufficient
conditions are given for the law of large numbers and for the
relative stability of maxima. It should be noted that Lemmas 1 and 2, as
it seems to us, are of interest in themselves and may be useful in
research on other limit problems.

It is easily seen that the results we are going to present extend also
to the distribution of the minimum term of a random series. It suffices to note
that if

$$\eta_n = \min(x_1, x_2, \ldots, x_n),$$

then

$$\eta_n = -\max(-x_1, -x_2, \ldots, -x_n).$$


1. The law of large numbers


We shall say that the sequence of maxima

(1) $\xi_1, \xi_2, \ldots, \xi_n, \ldots$

of a series of mutually independent random variables

(2) $x_1, x_2, \ldots, x_n, \ldots$

is subject to the law of large numbers if there exist constants $A_n$ such that

(3) $P\{|\xi_n - A_n| < \varepsilon\} \to 1$

for $n \to \infty$ and every $\varepsilon > 0$ given in advance.
The sequence of maxima (1) will be called relatively stable if, for a suitable
choice of positive constants $B_n$, the relation

(4) $P\Big\{\Big|\dfrac{\xi_n}{B_n} - 1\Big| < \varepsilon\Big\} \to 1$

holds for every $\varepsilon > 0$.
If the distribution function F(x) of the random variables of the sequence (2) has
the property that there exists a value $x_0$ such that

(5) $F(x_0) = 1$ and $F(x_0 - \varepsilon) < 1$

for every $\varepsilon > 0$, then the sequence (1) is subject to the law of large numbers.
Indeed, conditions (5) holding, we have

$$P\{|\xi_n - x_0| < \varepsilon\} = 1 - P\{\xi_n \le x_0 - \varepsilon\} = 1 - F^n(x_0 - \varepsilon + 0).$$

And, since for every $\varepsilon > 0$

$$\lim_{n \to \infty} F^n(x_0 - \varepsilon) = 0,$$

we have for every $\varepsilon > 0$

$$\lim_{n \to \infty} P\{|\xi_n - x_0| < \varepsilon\} = 1.$$

In an analogous manner, if conditions (5) hold and if $x_0 > 0$, then the sequence (1)
is relatively stable.
Indeed, in this case we have

$$P\Big\{\Big|\frac{\xi_n}{x_0} - 1\Big| < \varepsilon\Big\} = 1 - F^n(x_0(1 - \varepsilon) + 0)$$

and, since by (5) we have for $n \to \infty$

$$F^n(x_0(1 - \varepsilon)) \to 0,$$

we obtain, for $n \to \infty$,

$$P\Big\{\Big|\frac{\xi_n}{x_0} - 1\Big| < \varepsilon\Big\} \to 1.$$

It is evident that in the case $x_0 \le 0$ relative stability can no longer hold.


2 A distribution function is called improper or unitary if it belongs to the type

$$\varepsilon(x) = \begin{cases} 0 & \text{for } x \le 0 \\ 1 & \text{for } x > 0 \end{cases}$$
We thus see that the whole difficulty in finding the conditions
under which the law of large numbers and the relative stability of
maxima hold lies in the distributions for which the inequality F(x) < 1 holds for
all finite values of x.

THEOREM 1. In order that the sequence (1) be subject to the law of large numbers,
assuming F(x) < 1 for every finite value of x, it is necessary and sufficient that

(6) $\displaystyle\lim_{x \to \infty} \frac{1 - F(x + \varepsilon)}{1 - F(x)} = 0$

for every $\varepsilon > 0$.
PROOF: In view of the evident equality

$$P\{|\xi_n - A_n| < \varepsilon\} = F^n(A_n + \varepsilon) - F^n(A_n - \varepsilon + 0)$$

the conditions for the law of large numbers can be expressed in the
following form: for every $\varepsilon > 0$, we have

$$F^n(A_n + \varepsilon) \to 1, \qquad F^n(A_n - \varepsilon) \to 0$$

for $n \to \infty$.

From the first of these relations it follows, taking into account the hypothesis
of the theorem, that $A_n \to \infty$ for $n \to \infty$.

The relations found are equivalent to the following conditions:

$$n \log F(A_n + \varepsilon) \to 0, \qquad n \log F(A_n - \varepsilon) \to -\infty$$

for $n \to \infty$; now, since $1 - F(x) \to 0$ for $x \to \infty$ and since under this condition

$$\log F(x) = \log(1 - (1 - F(x))) = -(1 - F(x)) - \tfrac{1}{2}(1 - F(x))^2 - \cdots = -(1 - F(x))(1 + o(1)),$$

the conditions in question are equivalent to the following relations

$$n(1 - F(A_n + \varepsilon)) \to 0, \qquad n(1 - F(A_n - \varepsilon)) \to \infty$$

for $n \to \infty$.

Suppose now that the conditions of the theorem are satisfied, and let us show
that the law of large numbers holds. To this end, define the constants $A_n$
as the smallest values of x satisfying the inequalities

(8) $F(x - 0) \le 1 - \dfrac{1}{n} \le F(x + 0).$

By the hypothesis made on F(x) in the statement of the theorem it is evident
that $A_n \to \infty$ for $n \to \infty$. By the condition of the theorem, we have for
all $\varepsilon$ and $\eta$ ($\varepsilon > \eta > 0$)

$$\frac{1 - F(A_n + \varepsilon)}{1 - F(A_n + \eta)} \to 0, \qquad \frac{1 - F(A_n + \varepsilon)}{1 - F(A_n - \eta)} \to 0$$

for $n \to \infty$; now, $\eta > 0$ being arbitrary, we conclude that for $n \to \infty$

(9) $\dfrac{1 - F(A_n + \varepsilon)}{1 - F(A_n + 0)} \to 0, \qquad \dfrac{1 - F(A_n + \varepsilon)}{1 - F(A_n - 0)} \to 0.$

It follows from (8) that

$$n(1 - F(A_n + \varepsilon)) \le \frac{1 - F(A_n + \varepsilon)}{1 - F(A_n + 0)}$$

and consequently, by (9), we have for $n \to \infty$

(10) $n(1 - F(A_n + \varepsilon)) \to 0.$

From the condition of the theorem it follows that, for every $\varepsilon > 0$,

$$\lim_{x \to \infty} \frac{1 - F(x)}{1 - F(x - \varepsilon)} = 0;$$

from this we deduce, by reasoning analogous to the preceding, that for $n \to \infty$
we have

(11) $n(1 - F(A_n - \varepsilon)) \to \infty.$

Now, we have seen that the relation (3) follows from (10) and (11).

Suppose now that it is the law of large numbers that holds, that is,
suppose that there exists a sequence of constants $A_n$ such that the conditions
(10) and (11) are satisfied for every $\varepsilon > 0$. Let us show that the equality
(6) then holds as well.

It is evident that by (10) we have $A_n \to \infty$ for $n \to \infty$, and we may
suppose that the $A_n$ are non-decreasing. For every sufficiently
large value of x, we can find a number n such that

$$A_{n-1} \le x \le A_n.$$

It is evident that the inequalities

$$1 - F(A_{n-1} - \eta) \ge 1 - F(x - \eta) \ge 1 - F(A_n - \eta),$$
$$1 - F(A_{n-1} + \eta) \ge 1 - F(x + \eta) \ge 1 - F(A_n + \eta)$$

hold for every $\eta > 0$, as well as the inequalities

$$\frac{1 - F(A_{n-1} + \eta)}{1 - F(A_n - \eta)} \ge \frac{1 - F(x + \eta)}{1 - F(x - \eta)} \ge \frac{1 - F(A_n + \eta)}{1 - F(A_{n-1} - \eta)}.$$

It follows from (10) and (11) that

$$\lim_{x \to \infty} \frac{1 - F(x + \eta)}{1 - F(x - \eta)} = 0.$$

Replacing $x - \eta$ by x and $2\eta$ by $\varepsilon$ we obtain the condition of the theorem.
THEOREM 2. In order that the sequence (1) be relatively stable, assuming
F(x) < 1 for every finite value of x, it is necessary and sufficient that the relation

(12) $\displaystyle\lim_{x \to \infty} \frac{1 - F(kx)}{1 - F(x)} = 0$

hold for every k > 1.
PROOF: Taking into account the evident equality

$$P\Big\{\Big|\frac{\xi_n}{B_n} - 1\Big| < \varepsilon\Big\} = F^n(B_n(1 + \varepsilon)) - F^n(B_n(1 - \varepsilon) + 0)$$

we can write the conditions for stability in the following form: for
every $\varepsilon > 0$ and $n \to \infty$,

$$F^n(B_n(1 + \varepsilon)) \to 1, \qquad F^n(B_n(1 - \varepsilon)) \to 0.$$

By reasoning analogous to that which we used in the
proof of the preceding theorem, we see that these conditions are
equivalent to the following: for $n \to \infty$ we have

(13) $n(1 - F(B_n(1 + \varepsilon))) \to 0,$

(14) $n(1 - F(B_n(1 - \varepsilon))) \to \infty.$

Suppose first that the condition of the theorem is satisfied. Define $B_n$
as the smallest value of x satisfying the inequalities

(15) $F(x(1 - 0)) \le 1 - \dfrac{1}{n} \le F(x(1 + 0)).$

By the hypothesis made on the function F(x) we conclude that $B_n \to \infty$
for $n \to \infty$.
From (12) it follows that for all $\varepsilon$ and $\eta$ ($\varepsilon > \eta > 0$) we have

$$\frac{1 - F(B_n(1 + \varepsilon))}{1 - F(B_n(1 + \eta))} \to 0, \qquad \frac{1 - F(B_n(1 + \varepsilon))}{1 - F(B_n(1 - \eta))} \to 0$$

for $n \to \infty$. Now, since $\eta > 0$ is arbitrary, we deduce from this, for $n \to \infty$,

(16) $\dfrac{1 - F(B_n(1 + \varepsilon))}{1 - F(B_n(1 + 0))} \to 0, \qquad \dfrac{1 - F(B_n(1 + \varepsilon))}{1 - F(B_n(1 - 0))} \to 0.$

From the inequality (15) we conclude that

$$n(1 - F(B_n(1 + \varepsilon))) \le \frac{1 - F(B_n(1 + \varepsilon))}{1 - F(B_n(1 + 0))}$$

and, consequently, by (16), that

(17) $n(1 - F(B_n(1 + \varepsilon))) \to 0$

for $n \to \infty$. Now, since it follows from the condition of the theorem that, for every
$\varepsilon > 0$, we have

$$\lim_{x \to \infty} \frac{1 - F(x(1 - \varepsilon))}{1 - F(x)} = \infty,$$

we obtain from this, by reasoning analogous to the preceding,

(18) $n(1 - F(B_n(1 - \varepsilon))) \to \infty$

for $n \to \infty$.
But, as we know, the relative stability of the maxima follows from (17)
and (18).

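Suppose now that the maxima are relatively stable and that,
consequently, the relations (13) and (14) hold. Let us show that the equality (12)
then holds as well. From the fact that F(x) < 1 for every finite value of x and from the
relation (13) it follows that

$$B_n \to \infty \quad \text{for } n \to \infty.$$

We may evidently suppose that the $B_n$ never decrease. For
every sufficiently large value of x we can find an integer n such that

$$B_{n-1} \le x \le B_n.$$

It is evident that for all $\varepsilon$ and $\eta > 0$ we have

$$1 - F(B_{n-1}(1 - \eta)) \ge 1 - F(x(1 - \eta)) \ge 1 - F(B_n(1 - \eta)),$$
$$1 - F(B_{n-1}(1 + \varepsilon)) \ge 1 - F(x(1 + \varepsilon)) \ge 1 - F(B_n(1 + \varepsilon)),$$

whence

$$\frac{1 - F(B_{n-1}(1 + \varepsilon))}{1 - F(B_n(1 - \eta))} \ge \frac{1 - F(x(1 + \varepsilon))}{1 - F(x(1 - \eta))} \ge \frac{1 - F(B_n(1 + \varepsilon))}{1 - F(B_{n-1}(1 - \eta))}.$$

It follows from (13) and (14) that for every $\varepsilon > 0$ and every $\eta > 0$ the equality

$$\lim_{x \to \infty} \frac{1 - F(x(1 + \varepsilon))}{1 - F(x(1 - \eta))} = 0$$

holds. Setting $X = x(1 - \eta)$, $k = \dfrac{1 + \varepsilon}{1 - \eta}$, we obtain the condition (12).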

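Consider by way of example the following distribution functions:

(19) $F_1(x) = \begin{cases} 0 & \text{for } x < 1, \\ 1 - \dfrac{1}{x^{\alpha}} & \text{for } x \ge 1, \end{cases}$

(20) $F_2(x) = \begin{cases} 0 & \text{for } x \le 0, \\ 1 - e^{-x^{\alpha}} & \text{for } x > 0. \end{cases}$

In view of

$$\lim_{x \to \infty} \frac{1 - F_1(x + \varepsilon)}{1 - F_1(x)} = 1, \qquad \lim_{x \to \infty} \frac{1 - F_1(kx)}{1 - F_1(x)} = \frac{1}{k^{\alpha}}$$

we see that for the distribution function (19) the maxima are not
subject to the law of large numbers and are not stable, whatever
$\alpha > 0$ may be, whereas in view of

$$\lim_{x \to \infty} \frac{1 - F_2(x + \varepsilon)}{1 - F_2(x)} = \begin{cases} 0 & \text{for } \alpha > 1, \\ e^{-\varepsilon} & \text{for } \alpha = 1, \\ 1 & \text{for } \alpha < 1, \end{cases} \qquad \lim_{x \to \infty} \frac{1 - F_2(kx)}{1 - F_2(x)} = 0$$

we see that for the laws (20) relative stability does hold for all
values of $\alpha$, and that the law of large numbers holds only for $\alpha > 1$.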
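The criteria (6) and (12) can be tried out on the two examples by evaluating the tail ratios at a large argument. This is a minimal sketch: the function names are hypothetical, and working with the logarithm of the tail in example (20) is a design choice made here to avoid underflow.

```python
import math

def pareto_shift_ratio(x, eps, alpha):
    """(1 - F_1(x + eps)) / (1 - F_1(x)) for example (19), x >= 1."""
    return ((x + eps) / x) ** (-alpha)

def pareto_scale_ratio(x, k, alpha):
    """(1 - F_1(k x)) / (1 - F_1(x)) for example (19); equals k**(-alpha)."""
    return (k * x) ** (-alpha) / x ** (-alpha)

def weibull_shift_ratio(x, eps, alpha):
    """(1 - F_2(x + eps)) / (1 - F_2(x)) for example (20), via log-tails."""
    return math.exp(-((x + eps) ** alpha) + x ** alpha)

def weibull_scale_ratio(x, k, alpha):
    """(1 - F_2(k x)) / (1 - F_2(x)) for example (20), via log-tails."""
    return math.exp(-((k * x) ** alpha) + x ** alpha)

x = 1.0e4
for alpha in (0.5, 1.0, 2.0):
    print(alpha,
          pareto_shift_ratio(x, 0.1, alpha), pareto_scale_ratio(x, 2.0, alpha),
          weibull_shift_ratio(x, 0.1, alpha), weibull_scale_ratio(x, 2.0, alpha))
```

For (19) neither ratio tends to zero, so both criteria fail; for (20) the scale ratio is negligible for every $\alpha$, while the shift ratio is negligible only when $\alpha > 1$.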

It is easily verified that 1) for the Poisson law the maxima are relatively
stable but are not subject to the law of large numbers, 2) for the
Gauss law with variance equal to one and mean zero, the following relations
hold for $n \to \infty$:

$$P\Big\{\Big|\frac{\xi_n}{\sqrt{2 \log n}} - 1\Big| < \varepsilon\Big\} \to 1, \qquad P\{|\xi_n - \sqrt{2 \log n}| < \varepsilon\} \to 1$$

for every $\varepsilon > 0$.
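For the Gauss law the probability in question can be computed directly from $F^n$, which shows how slowly it approaches 1. The sketch below is illustrative: `log_Phi` and `prob_within` are hypothetical helper names, and n is passed through its logarithm so that very large n can be examined.

```python
import math

def log_Phi(x):
    """Logarithm of the standard normal distribution function, stable in the upper tail."""
    return math.log1p(-0.5 * math.erfc(x / math.sqrt(2.0)))

def prob_within(log_n, eps):
    """P{|xi_n - sqrt(2 log n)| < eps} = F^n(c + eps) - F^n(c - eps), with n = exp(log_n)."""
    n = math.exp(log_n)
    c = math.sqrt(2.0 * log_n)
    upper = math.exp(n * log_Phi(c + eps))
    lower = math.exp(n * log_Phi(c - eps))
    return upper - lower

for log_n in (math.log(100.0), 20.0, 60.0, 230.0):
    print(round(log_n, 2), prob_within(log_n, 0.5))
```

With $\varepsilon = 0.5$ the probability is still well below 1 at n = 100 and approaches 1 only for astronomically large n, reflecting a centering error of order $\log\log n/\sqrt{\log n}$.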

En 1932 Bruno de Finetti [6] a donné quelques conditions pour I|’applicabilité 
de la loi des grands nombres. Finetti considérait des variables aléatoires ayant 
densités f(x) = F’(x) et assujetties 4 certaines conditions supplémentaires; la 
condition suffisante trouvée par Finetti est exprimée par |’égalité 


lim f (z + €) = 

zo f(z) 
pour tout e > 0. La condition de Finetti résulte facilement du théoréme 1 (et 
cela sans condition supplémentaire imposée aux variables aléatoires). En 
effet, en admettant l’existence de la dérivée f(x) = F’(x) pour toutes les valeurs 
de x, on trouve par la régle de |’ Hospital ‘ 


³ When I proved this theorem, the results of Fisher and Tippett presented in their cited paper [2] were unknown to me. Since the proof given by these authors is not, in my opinion, sufficiently detailed, and since it relies on the superfluous hypothesis that the quantities $a_n$ and $b_n$ are analytic in the variable $n$, I thought it useful to set out the results of this section in the present work with all the necessary developments.



DISTRIBUTION LIMITE 431 
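$$\lim_{x\to\infty} \frac{1 - F(x+\varepsilon)}{1 - F(x)} = \lim_{x\to\infty} \frac{f(x+\varepsilon)}{f(x)},$$

from which it follows, by Theorem 1 of this paper, that whenever the limit

$$\lim_{x\to\infty} \frac{f(x+\varepsilon)}{f(x)}$$

exists for every $\varepsilon > 0$, de Finetti's condition is necessary and sufficient.
It is evident that an analogous condition can be stated for the relative stability of the maxima.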


2. The class of limit laws


THEOREM 3. The class of limit laws for $F^n(a_n x + b_n)$, where $a_n > 0$ and $b_n$ are suitably chosen constants, contains only laws of the types $\Phi_\alpha(x)$, $\Psi_\alpha(x)$, $\Lambda(x)$.³

PROOF: Suppose that for some choice of constants $a_n > 0$ and $b_n$ we have

$$F^n(a_n x + b_n) \to \Phi(x)$$

as $n \to \infty$. Then the equality

$$\lim_{n\to\infty} \left[F^n(a_{nk}x + b_{nk})\right]^k = \Phi(x) \tag{21}$$

holds for every integer $k > 0$. It follows that, $k$ being fixed and $n \to \infty$, the sequence of functions $F^n(a_{nk}x + b_{nk})$ tends to a limit function. By a theorem of A. Khintchine ([4], Theorem 43) this limit function must belong to the same type as $\Phi(x)$, that is, we must have

$$\lim_{n\to\infty} F^n(a_{nk}x + b_{nk}) = \Phi(\alpha_k x + \beta_k), \tag{22}$$

where $\alpha_k > 0$ and $\beta_k$ are constants.
It follows from (21) and (22) that for every natural number $k$ the limit law satisfies the equality

$$\Phi^k(\alpha_k x + \beta_k) = \Phi(x). \tag{23}$$

Consider separately the following three cases.
1) $\alpha_k < 1$ for some value of $k > 1$. For

$$x \ge \frac{\beta_k}{1 - \alpha_k}$$

we have

$$\alpha_k x + \beta_k \le x.$$




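Hence, the function $\Phi(x)$ being monotone, we may write

$$\Phi(\alpha_k x + \beta_k) \le \Phi(x).$$

It follows that for the distribution function $\Phi(x)$ the equality (23) can hold only if

$$\Phi(x) = 1 \quad \text{for} \quad x \ge \frac{\beta_k}{1 - \alpha_k}.$$

We now show that for $x < \beta_k/(1 - \alpha_k)$ we must have

$$\Phi(x) < 1.$$

Suppose the contrary, that is, suppose there exists a value $x_0 < \beta_k/(1 - \alpha_k)$ for which

$$\Phi(x_0) = 1. \tag{24}$$

It is evident that for every $x < x_0$ one can always choose an integer $n$ such that

$$x_0 \le \alpha_k^n x + \beta_k(1 + \alpha_k + \alpha_k^2 + \dots + \alpha_k^{n-1}),$$

since the right-hand side tends to $\beta_k/(1 - \alpha_k) > x_0$ as $n \to \infty$. Then, by (24), we must have

$$\Phi\left(\alpha_k^n x + \beta_k(1 + \alpha_k + \dots + \alpha_k^{n-1})\right) = 1.$$

Now, it follows from (23) that

$$\Phi^{k^n}\left(\alpha_k^n x + \beta_k(1 + \alpha_k + \dots + \alpha_k^{n-1})\right) = \Phi^{k^{n-1}}\left(\alpha_k^{n-1} x + \beta_k(1 + \alpha_k + \dots + \alpha_k^{n-2})\right) = \dots = \Phi^k(\alpha_k x + \beta_k) = \Phi(x), \tag{25}$$

that is,

$$\Phi(x) = 1$$

for every value of $x$, which is impossible.

We thus see that $\Phi(x) = 1$ for $x \ge \beta_k/(1 - \alpha_k)$ and $\Phi(x) < 1$ for $x < \beta_k/(1 - \alpha_k)$.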

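We now show that if $\Phi(x)$ is a proper law and if $\alpha_k < 1$ for one value of $k$, then this inequality also holds for all values of $k$. Suppose that there exists a number $r > 1$ for which $\alpha_r \ge 1$.

If $\alpha_r = 1$, we have for all values of $x$, $\Phi^r(x + \beta_r) = \Phi(x)$ and, consequently, $\Phi^r(x) = \Phi(x - \beta_r)$. It follows in particular that

$$\Phi^r\left(\frac{\beta_k}{1 - \alpha_k} + \beta_r\right) = \Phi\left(\frac{\beta_k}{1 - \alpha_k}\right), \qquad \Phi^r\left(\frac{\beta_k}{1 - \alpha_k}\right) = \Phi\left(\frac{\beta_k}{1 - \alpha_k} - \beta_r\right). \tag{26}$$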




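If $\beta_r \ne 0$, we have $\min\left(\beta_k/(1 - \alpha_k) + \beta_r,\; \beta_k/(1 - \alpha_k) - \beta_r\right) < \beta_k/(1 - \alpha_k)$, and it follows from (26), by the reasoning just developed, that $\Phi(x) \equiv 1$; if it is the case $\beta_r = 0$ that occurs, then we have

$$\Phi^r(x) = \Phi(x),$$

so that $\Phi(x)$ takes only the values 0 and 1: $\Phi(x) = 1$ for $x \ge \beta_k/(1 - \alpha_k)$, while $\Phi(x) = 0$ for $x < \beta_k/(1 - \alpha_k)$. Hence, $\Phi(x)$ being a proper law and $\alpha_k$ satisfying the inequality $\alpha_k < 1$, we see that $\alpha_r \ne 1$ for all $r$.

If $\alpha_r > 1$, we have for $x \le \beta_r/(1 - \alpha_r)$

$$\alpha_r x + \beta_r \le x$$

and, consequently,

$$\Phi(\alpha_r x + \beta_r) \le \Phi(x).$$

Hence, by (23), we obtain

$$\Phi(x) = 0 \quad \text{for} \quad x \le \frac{\beta_r}{1 - \alpha_r}. \tag{27}$$

Let $z < \beta_k/(1 - \alpha_k)$. Since $\alpha_k < 1$, one can find an integer $n$ and a value $x \le \beta_r/(1 - \alpha_r)$ such that

$$\alpha_k^n x + \beta_k(1 + \alpha_k + \dots + \alpha_k^{n-1}) = z.$$

In accordance with (25) and (27) we obtain

$$\Phi^{k^n}(z) = \Phi^{k^n}\left(\alpha_k^n x + \beta_k(1 + \alpha_k + \dots + \alpha_k^{n-1})\right) = \Phi(x) = 0,$$

that is, for every $z < \beta_k/(1 - \alpha_k)$ we have

$$\Phi(z) = 0.$$

The law $\Phi(x)$ is therefore improper, which contradicts the hypothesis.

The function $\Phi(x)$ is thus such that $\Phi(x) = 1$ for $x \ge \beta_k/(1 - \alpha_k) = x_0$ and $\Phi(x) < 1$ (and $\Phi(x) \not\equiv 0$) for $x < \beta_k/(1 - \alpha_k) = x_0$. Now, the point $x_0$ being evidently independent of $k$, we have

$$\frac{\beta_k}{1 - \alpha_k} = \frac{\beta_n}{1 - \alpha_n}$$

for all values of $k$ and $n$.

Set

$$\bar\Phi(z) = \Phi\left(z + \frac{\beta_k}{1 - \alpha_k}\right)$$

(this corresponds to moving the origin to the point $\beta_k/(1 - \alpha_k)$).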




It is evident that

$$\bar\Phi(\alpha_k z) = \Phi\left(\alpha_k z + \frac{\beta_k}{1 - \alpha_k}\right).$$

By (23) the function $\bar\Phi(z)$ satisfies the equation

$$\bar\Phi^k(\alpha_k z) = \bar\Phi(z) \tag{28}$$

for every positive integer $k$. The solution of this functional equation is well known (see, for example, [4], page 95): the only distribution function satisfying equation (28) and subject to the condition $\bar\Phi(z) = 1$ for $z \ge 0$ is the function $\Psi_\alpha(z)$.

2) $\alpha_k > 1$ for some $k$. It follows from the reasoning above that $\alpha_k > 1$ for all values of $k$. We have already seen (27) that

$$\Phi(x) = 0 \quad \text{for} \quad x \le \frac{\beta_k}{1 - \alpha_k}.$$

The proof of the inequality $\Phi(x) > 0$ for $x > \beta_k/(1 - \alpha_k)$ follows from the equality

$$\Phi^{k^n}(x) = \Phi\left(\frac{x}{\alpha_k^n} - \beta_k\left(\frac{1}{\alpha_k} + \frac{1}{\alpha_k^2} + \dots + \frac{1}{\alpha_k^n}\right)\right),$$

which in turn follows easily from (23). From this same equality it follows that $\Phi(x) < 1$ for all $x > \beta_k/(1 - \alpha_k)$. In a similar way we see that

$$\frac{\beta_k}{1 - \alpha_k} = \frac{\beta_n}{1 - \alpha_n}$$

for all values of $k$ and $n$, and that the function

$$\bar\Phi(z) = \Phi\left(z + \frac{\beta_k}{1 - \alpha_k}\right)$$

satisfies equation (28) for all $k > 0$.
The only distribution function that solves this equation and satisfies the condition $\bar\Phi(z) = 0$ for $z \le 0$ is the function $\Phi_\alpha(z)$.
3) $\alpha_k = 1$ for some $k$. It follows from the preceding that $\alpha_k = 1$ for all $k$. Making the change of variable

$$z = e^x, \qquad \bar\beta_k = e^{\beta_k}, \qquad \bar\Phi(z) = \begin{cases} \Phi(\log z) & \text{for } z > 0, \\ 0 & \text{for } z \le 0, \end{cases}$$

we reduce equation (23) to the form

$$\bar\Phi^k(\bar\beta_k z) = \bar\Phi(z).$$

The only distribution function satisfying this equation and the condition $\bar\Phi(0) = 0$ is the function $\Phi_\alpha(z)$. Thus we have

$$\Phi(x) = \bar\Phi(e^x) = e^{-e^{-\alpha x}}.$$

This function is of the type $\Lambda(x)$, which we discussed in the Introduction.
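As a quick sanity check on the functional equation (23), one can verify numerically that each of the three types satisfies it for explicit constants. The sketch below assumes the standard forms $\Phi_\alpha(x) = e^{-x^{-\alpha}}$ for $x > 0$, $\Psi_\alpha(x) = e^{-(-x)^{\alpha}}$ for $x < 0$ and $\Lambda(x) = e^{-e^{-x}}$; the helper names and the particular constants $\alpha_k$, $\beta_k$ are illustrative choices, not notation from the paper.

```python
import math

def phi(x, alpha):
    """Assumed form of Phi_alpha: exp(-x**(-alpha)) for x > 0, else 0."""
    return math.exp(-x ** (-alpha)) if x > 0 else 0.0

def psi(x, alpha):
    """Assumed form of Psi_alpha: exp(-(-x)**alpha) for x < 0, else 1."""
    return math.exp(-((-x) ** alpha)) if x < 0 else 1.0

def lam(x):
    """Assumed form of Lambda: exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def stability_gap(law, alpha_k, beta_k, k, x):
    """|law(alpha_k * x + beta_k)**k - law(x)|, which is 0 when (23) holds at x."""
    return abs(law(alpha_k * x + beta_k) ** k - law(x))

if __name__ == "__main__":
    k, alpha = 5, 1.7
    print(stability_gap(lambda t: phi(t, alpha), k ** (1 / alpha), 0.0, k, 0.8))   # alpha_k > 1
    print(stability_gap(lambda t: psi(t, alpha), k ** (-1 / alpha), 0.0, k, -0.8)) # alpha_k < 1
    print(stability_gap(lam, 1.0, math.log(k), k, 0.3))                            # alpha_k = 1
```

The three calls correspond to the three cases of the proof: $\alpha_k > 1$, $\alpha_k < 1$ and $\alpha_k = 1$ respectively.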




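3. Auxiliary propositions


LEMMA 1. Let $F_n(x)$ and $\Phi(x)$ be distribution functions, $\Phi(x)$ not being degenerate. If for certain sequences of real numbers $a_n > 0$, $b_n$, $\alpha_n > 0$, $\beta_n$ we have

$$F_n(a_n x + b_n) \to \Phi(x),$$

$$F_n(\alpha_n x + \beta_n) \to \Phi(x)$$

as $n \to \infty$, then as $n \to \infty$

$$\frac{\alpha_n}{a_n} \to 1, \qquad \frac{b_n - \beta_n}{a_n} \to 0.$$

PROOF: For brevity set

$$V_n(x) = F_n(a_n x + b_n).$$

By the hypothesis of the lemma we have, as $n \to \infty$,

$$V_n(x) \to \Phi(x)$$

and

$$V_n(A_n x + B_n) \to \Phi(x),$$

where

$$A_n = \frac{\alpha_n}{a_n}, \qquad B_n = \frac{\beta_n - b_n}{a_n}.$$

Choose a sequence of indices $n_1 < n_2 < \dots < n_k < \dots$ such that the limits

$$\lim_{k\to\infty} A_{n_k} = A, \qquad \lim_{k\to\infty} B_{n_k} = B \qquad (0 \le A \le +\infty,\; -\infty \le B \le +\infty)$$

exist. We show that $A < +\infty$. Suppose the contrary, that is, $A = +\infty$, and denote by $x_0$ the least upper bound of the numbers $x$ for which

$$\lim_{k\to\infty} (A_{n_k} x + B_{n_k}) < +\infty.$$

It is evident that for every $x > x_0$ we have

$$\lim_{k\to\infty} (A_{n_k} x + B_{n_k}) = +\infty,$$

while for every $x < x_0$

$$\lim_{k\to\infty} (A_{n_k} x + B_{n_k}) = -\infty.$$

From this we obtain

$$\Phi(x) = \begin{cases} 0 & \text{for } x < x_0, \\ 1 & \text{for } x > x_0. \end{cases}$$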




But since this is excluded by the hypothesis of the lemma, we have $A < +\infty$. It follows that $B$ is finite as well, since the hypothesis $B = -\infty$ would give, for all values of $x$, the equality

$$\lim_{k\to\infty} (A_{n_k} x + B_{n_k}) = -\infty,$$

from which it would follow in turn that $\Phi(x) \equiv 0$; similarly the hypothesis $B = +\infty$ would entail

$$\lim_{k\to\infty} (A_{n_k} x + B_{n_k}) = +\infty,$$

from which it would follow that $\Phi(x) \equiv 1$. It is evident that $A > 0$, since the $a_n$ and the $\alpha_n$ play the same role, and if we had $A = 0$ we would have the relation $\lim_{k\to\infty} (a_{n_k}/\alpha_{n_k}) = +\infty$, whose impossibility has just been proved.
Let $x$ be a value such that the function $\Phi$ is continuous at the points $x$ and $Ax + B$. It is then evident that

$$\Phi(x) = \lim_{k\to\infty} V_{n_k}(A_{n_k} x + B_{n_k}) = \Phi(Ax + B). \tag{29}$$

Since this equality must hold for all $x$, except at most at the points of discontinuity, it follows that $A = 1$ and $B = 0$. Indeed, suppose that this is not so and consider the cases that can then arise.

For $A < 1$, iterating (29), we obtain for all $x$ and for an arbitrary natural number $n$

$$\Phi(x) = \Phi\left(A^n x + B(1 + A + \dots + A^{n-1})\right).$$

And since $A^n x$ can be made as small as we please for $n$ sufficiently large, we have for all $x$

$$\Phi(x) = \lim_{n\to\infty} \Phi\left(A^n x + B(1 + A + \dots + A^{n-1})\right) = \Phi\left(\frac{B}{1 - A}\right).$$

The function $\Phi(x)$ is therefore not a distribution function.
For $A > 1$, we write (29) in the form

$$\Phi(x) = \Phi\left(\frac{x}{A} - \frac{B}{A}\right)$$

and by reasoning analogous to the preceding we arrive at the equality

$$\Phi(x) = \Phi\left(\frac{B}{1 - A}\right),$$

which must hold for all values of $x$, which proves that $\Phi(x)$ cannot be a distribution function. Finally, we prove the impossibility of the case $A = 1$, $B \ne 0$. Indeed, by (29) we see that for every integer $n$ we have

$$\Phi(x) = \Phi(x + nB).$$




If $B > 0$ ($B < 0$) then, for all $x$ and for $n \to +\infty$ ($n \to -\infty$),

$$\Phi(x) = \Phi(x + nB) \to \Phi(+\infty) = 1;$$

on the other hand, for $n \to -\infty$ ($n \to +\infty$),

$$\Phi(x) = \Phi(x + nB) \to \Phi(-\infty) = 0,$$

which is evidently possible only if $\Phi(x)$ is constant and, consequently, is not a distribution function. We thus see that whenever, for a sequence of indices $\{n_k\}$, the limits

$$\lim_{k\to\infty} A_{n_k} = A, \qquad \lim_{k\to\infty} B_{n_k} = B$$

exist, we necessarily have $A = 1$, $B = 0$. Now, we have to prove that

$$\lim_{n\to\infty} A_n = 1, \qquad \lim_{n\to\infty} B_n = 0;$$

if one or the other of these relations did not hold, there would evidently be a sequence of indices $\{n_k\}$ and a number $\varepsilon > 0$ such that at least one of the inequalities

$$\lim_{k\to\infty} |A_{n_k} - 1| \ge \varepsilon, \qquad \lim_{k\to\infty} |B_{n_k}| \ge \varepsilon \tag{30}$$

would be satisfied; moreover, the sequence of the $n_k$ can be chosen so that $A_{n_k}$ and $B_{n_k}$ tend to fixed limits as $k \to \infty$; but these limits can only be 1 and 0 respectively, by what we have just proved. Since this result contradicts (30), the lemma is proved.
In what follows we shall also need the converse proposition.

LEMMA 2. If $F_n(x)$ is a sequence of distribution functions satisfying the relation

$$\lim_{n\to\infty} F_n(a_n x + b_n) = \Phi(x)$$

for a certain choice of constants $a_n > 0$ and $b_n$ and for all values of $x$, then for any two sequences of constants $\alpha_n > 0$ and $\beta_n$ such that, as $n \to \infty$,

$$\frac{\alpha_n}{a_n} \to 1, \qquad \frac{b_n - \beta_n}{a_n} \to 0, \tag{31}$$

we have

$$F_n(\alpha_n x + \beta_n) \to \Phi(x)$$

as $n \to \infty$ and for all values of $x$.

PROOF: Let $x_1$, $x$ and $x_2$ ($x_1 < x < x_2$) be points of continuity of the function $\Phi$. By our hypothesis (31) we have, for $n$ sufficiently large,

$$x_1 < \frac{\alpha_n}{a_n}\, x + \frac{\beta_n - b_n}{a_n} < x_2.$$

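Since

$$\alpha_n x + \beta_n = a_n\left(\frac{\alpha_n}{a_n}\, x + \frac{\beta_n - b_n}{a_n}\right) + b_n,$$

we have, for $n$ sufficiently large,

$$a_n x_1 + b_n < \alpha_n x + \beta_n < a_n x_2 + b_n$$

and hence

$$F_n(a_n x_1 + b_n) \le F_n(\alpha_n x + \beta_n) \le F_n(a_n x_2 + b_n).$$

Letting $n \to \infty$ and using the hypothesis of the lemma, this shows that

$$\Phi(x_1) \le \liminf_{n\to\infty} F_n(\alpha_n x + \beta_n) \le \limsup_{n\to\infty} F_n(\alpha_n x + \beta_n) \le \Phi(x_2).$$

Letting $x_1$ and $x_2$ tend to $x$, we have

$$\Phi(x_1) \to \Phi(x), \qquad \Phi(x_2) \to \Phi(x),$$

$x$ being assumed to be a point of continuity of $\Phi$. We therefore have

$$\Phi(x) \le \liminf_{n\to\infty} F_n(\alpha_n x + \beta_n) \le \limsup_{n\to\infty} F_n(\alpha_n x + \beta_n) \le \Phi(x),$$

Q.E.D.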

LEMMA 3. If $F(x)$ is a distribution function and if for a choice of constants $a_n > 0$ and $b_n$ we have, as $n \to \infty$ and for all values of $x$,

$$F^n(a_n x + b_n) \to \Phi(x), \tag{32}$$

$\Phi(x)$ being a proper distribution function, then as $n \to \infty$

$$\frac{a_n}{a_{n+1}} \to 1, \qquad \frac{b_n - b_{n+1}}{a_n} \to 0.$$

PROOF. Indeed, if relation (32) holds, we have for all values of $x$ such that $\Phi(x) \ne 0$

$$\lim_{n\to\infty} F(a_n x + b_n) = 1.$$

From this we obtain, as $n \to \infty$,

$$F^{n+1}(a_n x + b_n) \to \Phi(x).$$

We are therefore in the conditions of Lemma 1 with $\alpha_n = a_{n-1}$, $\beta_n = b_{n-1}$, and this proves the lemma in question.
LEMMA 4. In order that

$$F^n(a_n x + b_n) \to \Phi(x) \tag{33}$$

for all values of $x$ as $n \to \infty$, it is necessary and sufficient that, as $n \to \infty$,

$$n\left[1 - F(a_n x + b_n)\right] \to -\log \Phi(x) \tag{34}$$

for all values of $x$ such that $\Phi(x) \ne 0$.




PROOF: Suppose that relation (33) holds; then for every value of $x$ such that $\Phi(x) \ne 0$, we have

$$\lim_{n\to\infty} F(a_n x + b_n) = 1. \tag{35}$$

It is evident that for these values of $x$ condition (33) is equivalent to the following:

$$n \log F(a_n x + b_n) \to \log \Phi(x) \qquad (\Phi(x) \ne 0) \tag{36}$$

as $n \to \infty$. Now, by (35), we have

$$\log F(a_n x + b_n) = -\left(1 - F(a_n x + b_n)\right) - \tfrac{1}{2}\left(1 - F(a_n x + b_n)\right)^2 - \tfrac{1}{3}\left(1 - F(a_n x + b_n)\right)^3 - \dots = -\left(1 - F(a_n x + b_n)\right)(1 + o(1)). \tag{37}$$

From this we see that, as soon as (33) holds, condition (34) is necessarily satisfied. Conversely, if it is condition (34) that holds, then (35) holds as well, and hence, by (37),

$$-n\left[1 - F(a_n x + b_n)\right] = n \log F(a_n x + b_n)\,(1 + o(1)).$$

From this, together with (34), follows (36) and, consequently, (33).
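The equivalence in Lemma 4 can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the paper: it takes the exponential law $F(y) = 1 - e^{-y}$ with the hypothetical normalization $a_n = 1$, $b_n = \log n$, for which the limit is $e^{-e^{-x}}$, and compares the quantity in (34), the quantity in (36) and the power in (33).

```python
import math

def exp_cdf(y):
    """Exponential distribution function, used here only as a test case."""
    return 1.0 - math.exp(-y) if y > 0 else 0.0

def lemma4_check(n, x):
    """Return (n[1 - F], -n log F, F**n) at the point a_n x + b_n with a_n = 1, b_n = log n."""
    y = x + math.log(n)
    p = exp_cdf(y)
    tail_form = n * (1.0 - p)      # left side of (34)
    log_form = -n * math.log(p)    # negative of the left side of (36)
    power = p ** n                 # left side of (33)
    return tail_form, log_form, power

if __name__ == "__main__":
    for n in (10, 10**3, 10**6):
        print(n, lemma4_check(n, 0.5))
```

As $n$ grows the first two numbers agree to more and more digits, which is the content of (37), and the third approaches the exponential of minus their common limit.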


4. The domain of attraction of the law $\Phi_\alpha(x)$


THEOREM 4. In order that a distribution function $F(x)$ belong to the domain of attraction of the law $\Phi_\alpha(x)$ it is necessary and sufficient that

$$\lim_{x\to\infty} \frac{1 - F(x)}{1 - F(kx)} = k^{\alpha} \tag{38}$$

for every value of $k > 0$.
PROOF: Suppose first that condition (38) is satisfied and let us show that the function $F(x)$ belongs to the domain of attraction of the law $\Phi_\alpha(x)$.
It is evident, from (38), that $F(x) < 1$ for every value of $x$. It follows that, for $n$ sufficiently large, the values of $x$ satisfying the inequality

$$1 - F(x) \le \frac{1}{n}$$

are positive.
Define $a_n$ as the smallest of the values of $x$ satisfying the inequalities

$$1 - F(x(1 + 0)) \le \frac{1}{n} \le 1 - F(x(1 - 0)); \tag{39}$$

it follows from the preceding that $a_n \to \infty$ as $n \to \infty$. By the condition of the theorem, for every value of $x$ and for every $\varepsilon$ ($0 < \varepsilon < 1$) we have

$$\frac{1 - F(a_n x)}{1 - F(a_n(1 + \varepsilon))} \to \left(\frac{1 + \varepsilon}{x}\right)^{\alpha}$$

and

$$\frac{1 - F(a_n x)}{1 - F(a_n(1 - \varepsilon))} \to \left(\frac{1 - \varepsilon}{x}\right)^{\alpha}$$

as $n$ tends to infinity. Since the left-hand sides of these relations are monotone functions of $\varepsilon$ and the right-hand sides are continuous functions of $\varepsilon$, the convergence in question is uniform, which allows us to write, as $n \to \infty$,

$$\frac{1 - F(a_n x)}{1 - F(a_n(1 + 0))} \to \frac{1}{x^{\alpha}}$$

and

$$\frac{1 - F(a_n x)}{1 - F(a_n(1 - 0))} \to \frac{1}{x^{\alpha}}.$$

Now, since by (39) we have

$$\frac{1 - F(a_n x)}{1 - F(a_n(1 - 0))} \le n\left(1 - F(a_n x)\right) \le \frac{1 - F(a_n x)}{1 - F(a_n(1 + 0))},$$

we see that for every $x > 0$

$$n\left(1 - F(a_n x)\right) \to x^{-\alpha}$$

as $n \to \infty$.
By Lemma 4 of the preceding section we have, as $n \to \infty$,

$$F^n(a_n x) \to \Phi_\alpha(x).$$

Suppose now that $F(x)$ belongs to the domain of attraction of the law $\Phi_\alpha(x)$, that is, suppose that, for a suitable choice of constants $a_n > 0$ and $b_n$, the relation

$$n\left(1 - F(a_n x + b_n)\right) \to x^{-\alpha} \tag{40}$$

holds for all values of $x$ ($x > 0$) as $n \to \infty$, and let us show that condition (38) follows.
For every constant $\beta > 1$ we have, as $n \to \infty$,

$$n\beta\left(1 - F(a_n x + b_n)\right) \to \beta x^{-\alpha}.$$

Since it follows from (40) that for $x > 0$ and $n \to \infty$ we have

$$1 - F(a_n x + b_n) \to 0,$$

we see that, as $n \to \infty$, we must have

$$[n\beta]\left(1 - F(a_n x + b_n)\right) \to \beta x^{-\alpha},$$

where $[n\beta]$ denotes the integer part of $n\beta$.

By the change of variable $x = z\beta^{1/\alpha}$ this relation takes the following form: as $n \to \infty$ and for every $z > 0$ we have

$$[n\beta]\left(1 - F(a_n \beta^{1/\alpha} z + b_n)\right) \to z^{-\alpha}. \tag{41}$$



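It follows from (40) that, as $n \to \infty$,

$$[n\beta]\left(1 - F(a_{[n\beta]} z + b_{[n\beta]})\right) \to z^{-\alpha}. \tag{42}$$

By Lemmas 4 and 1, and taking (41) and (42) into account, we conclude that the relations

$$\frac{a_{[n\beta]}}{a_n \beta^{1/\alpha}} \to 1, \qquad \frac{b_{[n\beta]} - b_n}{a_n} \to 0$$

must hold as $n \to \infty$. By Lemma 2, relation (42) will not change if we set

$$a_{[n\beta]} = a_n \beta^{1/\alpha}, \qquad b_{[n\beta]} = b_n. \tag{43}$$

Set

$$n_s = [n_{s-1}\beta], \qquad n_1 = [n\beta],$$

where the integer $n$ is to be regarded as fixed. It follows from (43) that for every natural number $s$ we have

$$a_{n_s} = \beta^{s/\alpha} a_n, \qquad b_{n_s} = b_{n_1} = b_n.$$

From this we obtain that, as $s \to \infty$,

$$\frac{b_{n_s}}{a_{n_s}} \to 0$$

and consequently, by Lemma 2, that as $s \to \infty$

$$n_s\left(1 - F(a_{n_s} x)\right) \to x^{-\alpha}. \tag{44}$$

Suppose that $y \to \infty$; for every sufficiently large value of $y$ one can find a value of $s$ such that

$$a_{n_s} x \le y \le a_{n_{s+1}} x$$

and hence

$$1 - F(a_{n_{s+1}} x) \le 1 - F(y) \le 1 - F(a_{n_s} x)$$

and, for $k > 0$,

$$1 - F(a_{n_{s+1}} kx) \le 1 - F(ky) \le 1 - F(a_{n_s} kx).$$

From this we obtain the inequality

$$\frac{1 - F(a_{n_{s+1}} x)}{1 - F(a_{n_s} kx)} \le \frac{1 - F(y)}{1 - F(ky)} \le \frac{1 - F(a_{n_s} x)}{1 - F(a_{n_{s+1}} kx)}. \tag{45}$$

Note that

$$n_{s+1} = \beta n_s - \theta_s,$$

where $0 \le \theta_s < 1$; hence as $s \to \infty$ we have

$$\frac{n_{s+1}}{n_s} \to \beta.$$

We conclude, by (44) and (45), that

$$\frac{1}{\beta}\, k^{\alpha} \le \lim_{y\to\infty} \frac{1 - F(y)}{1 - F(ky)} \le \beta k^{\alpha};$$

now, since $\beta$ can be chosen as close to unity as we please, the condition of the theorem follows.

We remark that it follows from the preceding that every distribution function $F(x)$ belonging to the domain of attraction of the law $\Phi_\alpha(x)$ is attracted to $\Phi_\alpha(x)$ in a more particular way, namely: for a choice of constants $a_n$ the equality

$$\lim_{n\to\infty} F^n(a_n x) = \Phi_\alpha(x)$$

holds.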
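The sufficiency argument of Theorem 4 can be traced on a concrete law. The following sketch assumes a pure power tail, $F(x) = 1 - x^{-\alpha}$ for $x \ge 1$, for which the constant $a_n$ defined by (39) is simply $n^{1/\alpha}$; the limit law is taken in the assumed form $\Phi_\alpha(x) = e^{-x^{-\alpha}}$. All function names are illustrative.

```python
import math

def pareto_cdf(x, alpha):
    """Assumed test law: F(x) = 1 - x**(-alpha) for x >= 1, else 0."""
    return 1.0 - x ** (-alpha) if x >= 1.0 else 0.0

def frechet_cdf(x, alpha):
    """Assumed form of Phi_alpha: exp(-x**(-alpha)) for x > 0, else 0."""
    return math.exp(-x ** (-alpha)) if x > 0 else 0.0

def a_n(n, alpha):
    """Smallest x with 1 - F(x) <= 1/n, cf. (39); for the test law this is n**(1/alpha)."""
    return n ** (1.0 / alpha)

def normalized_tail(n, x, alpha):
    """n(1 - F(a_n x)), which should approach x**(-alpha)."""
    return n * (1.0 - pareto_cdf(a_n(n, alpha) * x, alpha))

def max_cdf(n, x, alpha):
    """F**n(a_n x), the distribution function of the normalized maximum."""
    return pareto_cdf(a_n(n, alpha) * x, alpha) ** n

if __name__ == "__main__":
    for n in (10, 10**3, 10**6):
        print(n, normalized_tail(n, 1.5, 2.0), max_cdf(n, 1.5, 2.0), frechet_cdf(1.5, 2.0))
```

For this particular law the normalized tail equals $x^{-\alpha}$ exactly as soon as $a_n x \ge 1$, so only the passage from the tail to the $n$-th power (Lemma 4) introduces an error.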


5. The domain of attraction of the law $\Psi_\alpha(x)$


THEOREM 5. In order that a distribution function $F(x)$ belong to the domain of attraction of the law $\Psi_\alpha(x)$ it is necessary and sufficient that
1. there exist an $x_0$ such that

$$F(x_0) = 1 \quad \text{and} \quad F(x_0 - \varepsilon) < 1$$

for every $\varepsilon > 0$;
2.

$$\lim_{x\to -0} \frac{1 - F(kx + x_0)}{1 - F(x + x_0)} = k^{\alpha}$$

for every $k > 0$.
PROOF: Suppose that the conditions of the theorem hold and let us show that the function $F(x)$ belongs to the domain of attraction of the law $\Psi_\alpha(x)$. To this end we define $a_n$ as the smallest of the values $x > 0$ satisfying the inequalities

$$1 - F(-x(1 - 0) + x_0) \le \frac{1}{n} \le 1 - F(-x(1 + 0) + x_0). \tag{46}$$

By the first condition of the theorem we have, as $n \to \infty$,

$$a_n \to 0.$$

The second condition of the theorem gives the relations

$$\frac{1 - F(a_n x + x_0)}{1 - F(-a_n(1 + \varepsilon) + x_0)} \to \left(\frac{-x}{1 + \varepsilon}\right)^{\alpha}$$

and

$$\frac{1 - F(a_n x + x_0)}{1 - F(-a_n(1 - \varepsilon) + x_0)} \to \left(\frac{-x}{1 - \varepsilon}\right)^{\alpha}$$



for every $\varepsilon > 0$ and $x < 0$, as $n$ tends to infinity. Now, the left-hand sides of these relations are monotone functions of $\varepsilon$ and the right-hand sides are continuous functions of $\varepsilon$; hence the convergence in question is uniform, which allows us to write, as $n \to \infty$,

$$\frac{1 - F(a_n x + x_0)}{1 - F(-a_n(1 + 0) + x_0)} \to (-x)^{\alpha}$$

and

$$\frac{1 - F(a_n x + x_0)}{1 - F(-a_n(1 - 0) + x_0)} \to (-x)^{\alpha}.$$

Now, since by (46) we have

$$\frac{1 - F(a_n x + x_0)}{1 - F(-a_n(1 + 0) + x_0)} \le n\left(1 - F(a_n x + x_0)\right) \le \frac{1 - F(a_n x + x_0)}{1 - F(-a_n(1 - 0) + x_0)},$$

we can assert that for all $x < 0$ and as $n \to \infty$ we also have

$$n\left(1 - F(a_n x + x_0)\right) \to (-x)^{\alpha},$$

from which, by Lemma 4, we see that the function $F(x)$ belongs to the domain of attraction of the law $\Psi_\alpha(x)$.

Suppose now that $F(x)$ belongs to the domain of attraction of the law $\Psi_\alpha(x)$, which means that for all values of $x$ and for a certain choice of the $a_n > 0$ and $b_n$ we have

$$F^n(a_n x + b_n) \to \Psi_\alpha(x) \tag{47}$$

as $n \to \infty$. From this we obtain, as $n \to \infty$ and for every positive integer $k$,

$$F^n(a_{nk} x + b_{nk}) \to \Psi_\alpha^{1/k}(x) = \Psi_\alpha(\gamma x), \tag{48}$$

where

$$\gamma = k^{-1/\alpha}.$$

Replacing $\gamma x$ by $x$ in (48), we see that, as $n \to \infty$,

$$F^n\left(\frac{a_{nk}}{\gamma}\, x + b_{nk}\right) \to \Psi_\alpha(x). \tag{49}$$

Comparing (47) and (49), we conclude by Lemmas 1 and 2 that the $a_n$ and $b_n$ can indeed be chosen so as to have

$$a_{nk} = \gamma a_n, \qquad b_{nk} = b_n.$$

It follows that we can always make this choice so as to have, for every $k$,

$$b_{nk} = b_n \tag{50}$$

and $a_n \to 0$ as $n \to \infty$.




If the inequality

$$F(x) < 1$$

holds for all values of $x$, we obtain from (47), setting $x = 0$ there, that $b_n \to \infty$ as $n \to \infty$, which contradicts (50). We have thus proved the necessity of the first condition of the theorem. If relation (47) holds, $b_n$ must be chosen so as to have

$$F^n(b_n) \to \Psi_\alpha(0) = 1,$$

that is, we must have

$$b_n \to x_0.$$

And, by (50) and Lemma 2, we can make this choice by setting

$$b_n = x_0. \tag{51}$$

By Lemma 4 and (51), relation (47) is equivalent to the following relation: as $n \to \infty$,

$$n\left(1 - F(a_n x + x_0)\right) \to (-x)^{\alpha}. \tag{52}$$

From this relation we first obtain the equality $\lim_{n\to\infty} a_n = 0$. Indeed, we have $a_n x + x_0 < x_0$ for $x < 0$, and in order that the left-hand side of (52) tend to a finite limit it is necessary that the equality $\lim_{n\to\infty} (a_n x + x_0) = x_0$ hold for every $x < 0$.
Suppose now that $y \to -0$. For every $y < 0$ sufficiently small in absolute value it is possible to find an $n$ sufficiently large so as to have either

$$-a_n \le y \le -a_{n+1}$$

if $a_{n+1} \le a_n$, or

$$-a_{n+1} \le y \le -a_n$$

if $a_n \le a_{n+1}$. In the first case we see that

$$1 - F(-a_{n+1} + x_0) \le 1 - F(y + x_0) \le 1 - F(-a_n + x_0)$$

and that, for every $k > 0$,

$$1 - F(-a_{n+1} k + x_0) \le 1 - F(ky + x_0) \le 1 - F(-a_n k + x_0),$$

from which it follows that

$$\frac{1 - F(-a_{n+1} k + x_0)}{1 - F(-a_n + x_0)} \le \frac{1 - F(ky + x_0)}{1 - F(y + x_0)} \le \frac{1 - F(-a_n k + x_0)}{1 - F(-a_{n+1} + x_0)}.$$

In the second case we obtain in an analogous way the inequality

$$\frac{1 - F(-a_n k + x_0)}{1 - F(-a_{n+1} + x_0)} \le \frac{1 - F(ky + x_0)}{1 - F(y + x_0)} \le \frac{1 - F(-a_{n+1} k + x_0)}{1 - F(-a_n + x_0)}.$$




Now, since by (52) in both of the cases considered the extreme members of the inequalities obtained tend to $k^{\alpha}$ as $n \to \infty$, we arrive at the second condition of the theorem.
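A law with a finite right endpoint gives a simple check of Theorem 5. The sketch below assumes the test law $F(x) = 1 - (1 - x)^{\alpha}$ on $[0, 1]$ (so $x_0 = 1$), the normalization $a_n = n^{-1/\alpha}$, $b_n = x_0$, and the limit in the assumed form $\Psi_\alpha(x) = e^{-(-x)^{\alpha}}$ for $x < 0$; the names are illustrative.

```python
import math

X0 = 1.0  # right endpoint of the assumed test law

def endpoint_cdf(x, alpha):
    """Assumed test law: F(x) = 1 - (1 - x)**alpha on [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= X0:
        return 1.0
    return 1.0 - (X0 - x) ** alpha

def psi(x, alpha):
    """Assumed form of Psi_alpha: exp(-(-x)**alpha) for x < 0, else 1."""
    return math.exp(-((-x) ** alpha)) if x < 0 else 1.0

def max_cdf_at(n, x, alpha):
    """F**n(a_n x + x0) with a_n = n**(-1/alpha), cf. (51) and (52)."""
    a = n ** (-1.0 / alpha)
    return endpoint_cdf(X0 + a * x, alpha) ** n

if __name__ == "__main__":
    for n in (10, 10**3, 10**6):
        print(n, max_cdf_at(n, -0.7, 2.0), psi(-0.7, 2.0))
```

Here $n(1 - F(a_n x + x_0)) = (-x)^{\alpha}$ holds exactly once $a_n |x| \le 1$, which is relation (52) for this test law.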

6. The domain of attraction of the law $\Lambda(x)$


We have seen in the preceding sections that the laws $\Phi_\alpha(x)$ attract only distribution functions for which $F(x) < 1$ for all $x$, and that the laws $\Psi_\alpha(x)$ attract only functions for which $F(x_0) = 1$ for a finite value of $x_0$ and $F(x_0 - \varepsilon) < 1$ for every $\varepsilon > 0$. It is easy to see that the law $\Lambda(x)$ attracts functions of both kinds. We give examples.

EXAMPLE 1. Let

$$F(x) = \begin{cases} 0 & \text{for } x < 0, \\ 1 - e^{-x^{\alpha}} & \text{for } x \ge 0, \end{cases}$$

where $\alpha > 0$ is a constant.
One finds without difficulty that, as $n \to \infty$,

$$F^n(a_n x + b_n) \to \Lambda(x),$$

where

$$a_n = \frac{1}{\alpha}(\log n)^{\frac{1}{\alpha} - 1}, \qquad b_n = (\log n)^{\frac{1}{\alpha}}.$$


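EXAMPLE 2. Let

$$F(x) = \begin{cases} 0 & \text{for } x < 0, \\ 1 - e^{-\frac{x}{1 - x}} & \text{for } 0 \le x < 1, \\ 1 & \text{for } x \ge 1. \end{cases}$$

We also have, as $n \to \infty$,

$$F^n(a_n x + b_n) \to \Lambda(x)$$

if we agree to set

$$a_n = \frac{1}{(1 + \log n)^2}, \qquad b_n = \frac{\log n}{1 + \log n}.$$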
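The normalization of Example 1 can be checked numerically. The sketch below assumes the tail $1 - F(x) = e^{-x^{\alpha}}$ and the constants $a_n = \frac{1}{\alpha}(\log n)^{1/\alpha - 1}$, $b_n = (\log n)^{1/\alpha}$; it works with $\log n$ directly so that very large $n$ can be used without overflow. The function name is illustrative.

```python
import math

def normalized_log_tail(log_n, x, alpha):
    """log of n(1 - F(a_n x + b_n)) for the assumed tail exp(-y**alpha); should approach -x."""
    b = log_n ** (1.0 / alpha)
    a = log_n ** (1.0 / alpha - 1.0) / alpha
    y = a * x + b
    if y <= 0:
        raise ValueError("a_n x + b_n must be positive for this tail formula")
    return log_n - y ** alpha

if __name__ == "__main__":
    for log_n in (10.0, 50.0, 230.0):
        print(log_n, normalized_log_tail(log_n, 1.0, 2.0))
```

For $\alpha = 1$ the result is exactly $-x$; for $\alpha = 2$ the error is $x^2/(4\log n)$, so convergence is only logarithmic in $n$.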
LEMMA 5. If, for certain $a_n$ and $b_n$ and for all values of $x$, we have

$$\lim_{n\to\infty} n\left(1 - F(a_n x + b_n)\right) = e^{-x}, \tag{53}$$

then

$$\frac{a_n}{b_n} \to 0 \tag{54}$$

as $n \to \infty$.




PROOF: Indeed, from (53) we conclude that, for every value of $x$ and as $n \to \infty$,

$$a_n x + b_n \to \infty \tag{55}$$

if $F(x) < 1$ for all values of $x$; and that

$$a_n x + b_n \to x_0, \tag{56}$$

if $F(x_0) = 1$ while $F(x_0 - \varepsilon) < 1$ for every $\varepsilon > 0$. Let $x = -A$ ($A > 0$); it is then evident, by (55) and (56), that in the case $x_0 > 0$ we have, for every $A$ and for $n$ sufficiently large,

$$-a_n A + b_n > 0,$$

that is,

$$\frac{a_n}{b_n} < \frac{1}{A}.$$

Since $A$ is arbitrary, relation (54) follows.
If $x_0 < 0$, we obtain from (56), setting $x = 0$ there, the relation

$$b_n \to x_0$$

as $n \to \infty$. Now, from this same relation (56), setting $x = 1$ there, we obtain as $n \to \infty$

$$a_n + b_n \to x_0,$$

whence $a_n \to 0$ as $n \to \infty$ and, consequently, $a_n/b_n \to 0$ as $n \to \infty$. If it is the case $x_0 = 0$ that occurs, then, although $b_n \to x_0 = 0$ as $n \to \infty$, we have $b_n < 0$ for every $n$ sufficiently large, since if this were not so we would have, for $x > 0$, $a_n x + b_n > 0$, and this would lead to the equality

$$n\left(1 - F(a_n x + b_n)\right) = 0,$$

which contradicts the hypothesis of the lemma. By (53), for $n$ sufficiently large and for every $x$ we have

$$a_n x + b_n < 0;$$

hence, for $x > 0$,

$$\frac{a_n}{|b_n|} < \frac{1}{x}.$$

Now, since this inequality holds for every value of $x$, relation (54) follows.
THEOREM 6. In order that a distribution function $F(x)$ belong to the domain of attraction of the law $\Lambda(x)$ it is necessary and sufficient that the relation

$$\lim_{n\to\infty} n\left(1 - F(a_n x + b_n)\right) = e^{-x} \tag{57}$$

hold for all values of $x$, where the constants $b_n$ are defined as the smallest values of $x$ satisfying the inequalities

$$F(x - 0) \le 1 - \frac{1}{n} \le F(x + 0) \tag{58}$$

and the constants $a_n$ are the smallest values of $x$ satisfying the inequalities

$$F(x(1 - 0) + b_n) \le 1 - \frac{1}{ne} \le F(x(1 + 0) + b_n). \tag{59}$$


PROOF: By Lemma 4, the stated conditions are sufficient for a distribution function $F(x)$ to belong to the domain of attraction of the law $\Lambda(x)$.

Conversely, if $F(x)$ belongs to the domain of attraction of the law $\Lambda(x)$, then, by Lemma 4, we must have for a certain choice of the $\alpha_n$ and $\beta_n$

$$n\left(1 - F(\alpha_n x + \beta_n)\right) \to e^{-x} \tag{60}$$

for all values of $x$ and as $n \to \infty$.
Whatever $\varepsilon > 0$ may be, by (60) the inequalities

$$n\left(1 - F(\alpha_n \varepsilon + \beta_n)\right) < 1 < n\left(1 - F(-\alpha_n \varepsilon + \beta_n)\right)$$

and

$$n\left(1 - F(\alpha_n(1 + \varepsilon) + \beta_n)\right) < \frac{1}{e} < n\left(1 - F(\alpha_n(1 - \varepsilon) + \beta_n)\right)$$

must hold for $n$ sufficiently large.
From these inequalities and from (58) and (59) we obtain, for every $\varepsilon$, the inequalities

$$-\alpha_n \varepsilon + \beta_n \le b_n \le \alpha_n \varepsilon + \beta_n,$$

$$\alpha_n(1 - \varepsilon) + \beta_n \le a_n + b_n \le \alpha_n(1 + \varepsilon) + \beta_n.$$

Since these inequalities hold for every $\varepsilon > 0$, $n$ being sufficiently large, we can choose a sequence $\varepsilon_n > 0$ ($\varepsilon_n \to 0$ as $n \to \infty$) so as to have

$$-\alpha_n \varepsilon_n + \beta_n \le b_n \le \alpha_n \varepsilon_n + \beta_n,$$

$$\alpha_n(1 - \varepsilon_n) + \beta_n \le a_n + b_n \le \alpha_n(1 + \varepsilon_n) + \beta_n.$$

The first of these inequalities gives the inequality

$$\left|\frac{b_n - \beta_n}{\alpha_n}\right| \le \varepsilon_n,$$

and the second, combined with the inequality just obtained, leads to the inequality

$$\left|\frac{a_n}{\alpha_n} - 1\right| \le 2\varepsilon_n.$$

Consequently, by Lemma 2, we can assert that whenever relation (60) is satisfied for a certain choice of the $\alpha_n$ and $\beta_n$, the same is true of (57), the choice of the $a_n$ and $b_n$ being made in accordance with (58) and (59). The theorem is thus proved.
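For a continuous, strictly increasing $F$ the definitions (58) and (59) reduce to quantile equations: $1 - F(b_n) = 1/n$ and $1 - F(a_n + b_n) = 1/(ne)$. The sketch below computes these constants for the assumed tail $1 - F(x) = e^{-x^{\alpha}}$, working with $\log n$ as the input; the function name and this reduction to quantiles are illustrative assumptions rather than the paper's wording.

```python
import math

def theorem6_constants(log_n, alpha):
    """Return (a_n, b_n) from 1 - F(b_n) = 1/n and 1 - F(a_n + b_n) = 1/(n e)
    for the assumed tail 1 - F(x) = exp(-x**alpha)."""
    b = log_n ** (1.0 / alpha)                # b_n**alpha = log n
    a = (log_n + 1.0) ** (1.0 / alpha) - b    # (a_n + b_n)**alpha = log n + 1
    return a, b

if __name__ == "__main__":
    for log_n in (10.0, 115.0, 1000.0):
        a, b = theorem6_constants(log_n, 2.0)
        # Compare with the explicit constant (1/alpha) * (log n)**(1/alpha - 1).
        print(log_n, a, 0.5 * log_n ** -0.5, b)
```

For $\alpha = 1$ this gives $a_n = 1$, $b_n = \log n$; for other $\alpha$ the quantile-based $a_n$ differs from $\frac{1}{\alpha}(\log n)^{1/\alpha - 1}$ only by a factor tending to 1, which by Lemma 2 does not affect the limit.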

THEOREM 7. In order that a distribution function $F(x)$ belong to the domain of attraction of the law $\Lambda(x)$ it is necessary and sufficient that there exist a continuous function $A(z)$ such that $A(z) \to 0$ as $z \to x_0 - 0$ and that, for all values of $x$,

$$\lim_{z\to x_0 - 0} \frac{1 - F(z(1 + A(z)x))}{1 - F(z)} = e^{-x}, \tag{61}$$

the number $x_0 \le +\infty$ being determined by the relations $F(x_0) = 1$, $F(x) < 1$ for $x < x_0$.

PROOF: Suppose first that $F(x)$ belongs to the domain of attraction of the law $\Lambda(x)$. Then for a certain choice of the constants $a_n > 0$ and $b_n$ we have

$$\lim_{n\to\infty} n\left(1 - F(a_n x + b_n)\right) = e^{-x} \tag{62}$$

for all values of $x$. It follows that

$$\lim_{n\to\infty} n\left(1 - F(b_n)\right) = 1. \tag{63}$$

It is evident that the equality $a_n = 0$ can hold only for a finite number of values of $n$; we may therefore always suppose that $a_n > 0$ for all values of $n$. It follows from Theorem 6 that we may regard the $b_n$ as non-decreasing functions of $n$.

Set $A(b_n) = a_n/b_n$ for all values of $n$ and, for $b_{n-1} \le z \le b_n$, define $A(z)$ so that it is a continuous and monotone function of $z$. It is evident that for every sufficiently large $z < x_0$ it is always possible to determine an integer $n$ such that $b_{n-1} \le z \le b_n$. It follows that

$$1 - F(b_n) \le 1 - F(z) \le 1 - F(b_{n-1}). \tag{64}$$

By the definition of the function $A(z)$ we must have either

$$\frac{a_{n-1}}{b_{n-1}} \le A(z) \le \frac{a_n}{b_n}$$

or

$$\frac{a_n}{b_n} \le A(z) \le \frac{a_{n-1}}{b_{n-1}}.$$

In the first case we see that for $x > 0$

$$a_{n-1} x + b_{n-1} \le z(1 + A(z)x) \le a_n x + b_n$$

and for $x < 0$

$$a_n x + b_{n-1} \le z(1 + A(z)x) \le a_{n-1} x + b_n.$$




Hence, in the first case, we have for $x > 0$

$$1 - F(a_n x + b_n) \le 1 - F(z(1 + A(z)x)) \le 1 - F(a_{n-1} x + b_{n-1}) \tag{65}$$

and for $x < 0$

$$1 - F(a_{n-1} x + b_n) \le 1 - F(z(1 + A(z)x)) \le 1 - F(a_n x + b_{n-1}). \tag{66}$$

From (63), (64) and (65) we obtain, for $x > 0$, the inequalities

$$(n - 1)\left[1 - F(a_n x + b_n)\right](1 + o(1)) \le \frac{1 - F(z(1 + A(z)x))}{1 - F(z)} \le n\left[1 - F(a_{n-1} x + b_{n-1})\right](1 + o(1)).$$

Now, these inequalities, by Lemmas 3 and 2 and relation (62), imply the equality (61) for $x > 0$.

In an analogous way we obtain (61) from (66). Analogous reasoning leads once again to (61) in the case not yet considered, $a_n/b_n \le A(z) \le a_{n-1}/b_{n-1}$. This completes the proof of the necessity of condition (61). As for the proof of the sufficiency of the condition of the theorem, we note first of all that (61) implies the equality

$$\lim_{z\to x_0 - 0} \frac{1 - F(z + 0)}{1 - F(z)} = 1. \tag{67}$$

Indeed, since for every $x > 0$ we have

$$F(z(1 + A(z)x)) \ge F(z + 0),$$

we can write

$$1 \ge \limsup_{z\to x_0 - 0} \frac{1 - F(z + 0)}{1 - F(z)} \ge \liminf_{z\to x_0 - 0} \frac{1 - F(z + 0)}{1 - F(z)} \ge \lim_{z\to x_0 - 0} \frac{1 - F(z(1 + A(z)x))}{1 - F(z)} = e^{-x}.$$

Since these inequalities hold for all values of $x > 0$, they remain valid in the limit as $x \to 0$.

Suppose now that the conditions of the theorem hold and let us show that $F(x)$ then belongs to the domain of attraction of the law $\Lambda(x)$. To this end we define $b_n$ as the smallest value of $x$ satisfying the inequality

$$1 - F(x + 0) \le \frac{1}{n} \le 1 - F(x).$$

From this we obtain

$$\frac{1 - F(b_n(1 + A(b_n)x))}{1 - F(b_n)} \le n\left(1 - F(b_n(1 + A(b_n)x))\right) \le \frac{1 - F(b_n(1 + A(b_n)x))}{1 - F(b_n + 0)}.$$




By (61) and (67) we see that

$$\lim_{n\to\infty} n\left(1 - F(b_n(1 + A(b_n)x))\right) = e^{-x}.$$

Setting $a_n = b_n A(b_n)$ we obtain (62). The theorem is proved.
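Condition (61) can be tried out on the Gaussian law, for which $x_0 = +\infty$. The sketch below assumes the auxiliary function $A(z) = 1/z^2$, which is a choice made here for illustration and is not taken from the paper, and evaluates the ratio in (61) with the complementary error function from the Python standard library.

```python
import math

def normal_tail(z):
    """1 - F(z) for the standard Gaussian law, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def ratio_61(z, x):
    """(1 - F(z(1 + A(z)x))) / (1 - F(z)) with the assumed choice A(z) = 1/z**2."""
    A = 1.0 / (z * z)
    return normal_tail(z * (1.0 + A * x)) / normal_tail(z)

if __name__ == "__main__":
    for z in (5.0, 10.0, 30.0):
        print(z, ratio_61(z, 1.0), math.exp(-1.0))
```

The printed ratios approach $e^{-x}$ as $z$ grows, with a relative error of order $1/z^2$.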
The following propositions follow from the theorem just proved.

COROLLARY 1. Suppose that the distribution function $F(x)$ is such that $F(x) < 1$ for every value of $x$. Then in order that the function $F(x)$ belong to the domain of attraction of the law $\Lambda(x)$ it is necessary that

$$\frac{1 - F(ky)}{1 - F(y)} \to 0$$

for every constant $k > 1$ and for $y \to \infty$, that is, that the sequence of maxima be relatively stable.
PROOF: Set

$$\Phi_z(x) = \frac{1 - F(z(1 + A(z)x))}{1 - F(z)}; \tag{68}$$

the functions $\Phi_z(x)$ are non-increasing with respect to $x$. We have had occasion several times to use the remark that if a sequence of monotone functions converges at every point to a continuous function, the convergence is uniform. By the uniform convergence of $\Phi_z(x)$ to $e^{-x}$, $x_z$ being a sequence tending to infinity as $z \to \infty$, we must have

$$\lim_{z\to\infty} \Phi_z(x_z) = \lim_{z\to\infty} e^{-x_z} = 0. \tag{69}$$

Take an $\alpha > 0$ and set

$$x_z = \frac{\alpha}{A(z)}.$$

By the definition of the function $A(z)$ we have, as $z \to \infty$,

$$\lim_{z\to\infty} x_z = \infty.$$

It follows from (68) and (69) that

$$\lim_{z\to\infty} \frac{1 - F((1 + \alpha)z)}{1 - F(z)} = 0.$$

It is easy to see that the necessary condition we have just found is by no means sufficient. To show this, consider a distribution function defined as follows:

$$F(x) = \begin{cases} 0 & \text{for } x < 0, \\ 1 - e^{-[x]} & \text{for } x \ge 0, \end{cases}$$




where $[x]$ denotes the integer part of $x$, and let us show that it is impossible to choose the $a_n$ and $b_n$ so as to have

$$\lim_{n\to\infty} n\left(1 - F(a_n x + b_n)\right) = e^{-x}$$

for all values of $x$.
In other words, let us show that there cannot exist constants $a_n$ and $b_n$ giving the equality

$$\lim_{n\to\infty} n e^{-[a_n x + b_n]} = e^{-x}$$

for all $x$ or, what is equivalent, the equality

$$\lim_{n\to\infty} \left(\log n - [a_n x + b_n]\right) = -x,$$

again for all values of $x$. Consider the subsequence $n_k = [e^{k + 0.5}]$, where $k$ is an integer. We have, as $k \to \infty$, $\log n_k - k \to 0.5$; consequently, there cannot exist $b_{n_k}$ such that the relation

$$\lim_{k\to\infty} \left(\log n_k - [a_{n_k} \cdot 0 + b_{n_k}]\right) = \lim_{k\to\infty} \left(\log n_k - [b_{n_k}]\right) = 0$$

holds. It is evident that this proves the proposition in question.
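The obstruction in this counterexample is purely arithmetic: along $n_k = [e^{k+0.5}]$ the number $\log n_k$ stays about $0.5$ away from every integer, so $\log n_k - [b_{n_k}]$ cannot tend to 0. A minimal sketch of that computation follows; the function name is illustrative.

```python
import math

def gap_to_integers(k):
    """Distance from log(n_k) to the nearest integer, where n_k = floor(exp(k + 0.5))."""
    n_k = math.floor(math.exp(k + 0.5))
    t = math.log(n_k)
    return min(t - math.floor(t), math.ceil(t) - t)

if __name__ == "__main__":
    for k in (5, 10, 20):
        print(k, gap_to_integers(k))
```

Since $[b_{n_k}]$ is always an integer, the quantity $|\log n_k - [b_{n_k}]|$ is bounded below by this gap for any choice of $b_{n_k}$.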
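COROLLARY 2. Let $F(x)$ be a distribution function. If there exists a sequence $x_k \to x_0$⁴ ($x_k < x_0$) such that

$$\frac{1 - F(x_k - 0)}{1 - F(x_k + 0)} \ge 1 + \beta, \tag{70}$$

where the constant $\beta$ is positive, the function $F(x)$ cannot belong to the domain of attraction of $\Lambda(x)$.

PROOF: The inequality (70) is incompatible with the equality (67), and the latter follows from (61).

REMARK. It follows from Theorems 4 and 5 that if condition (70) holds, $F(x)$ cannot belong to the domains of attraction of the laws $\Phi_\alpha(x)$ and $\Psi_\alpha(x)$.

EXAMPLE. The Poisson law

$$F(x) = \begin{cases} 0 & \text{for } x \le 0, \\ \sum\limits_{0 \le k < x} \dfrac{e^{-\lambda}\lambda^k}{k!} & \text{for } x > 0, \end{cases}$$

is not attracted to any of the limit laws. Indeed, setting $x_k = k$, we see that

$$\frac{1 - F(k - 0)}{1 - F(k + 0)} = 1 + \frac{\lambda^k / k!}{\sum\limits_{s > k} \lambda^s / s!} \ge \frac{k + 1}{\lambda};$$

hence condition (70) holds for $k + 1 > \lambda$.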
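The jump ratio in (70) for the Poisson law can be computed directly. The sketch below assumes the convention $F(x) = P(\xi < x)$, so that $1 - F(k - 0) = P(\xi \ge k)$ and $1 - F(k + 0) = P(\xi > k)$, and compares the ratio with the lower bound $(k+1)/\lambda$, which follows from bounding the tail by a geometric series. The function name and the truncation parameter `terms` are illustrative.

```python
def poisson_jump_ratio(k, lam, terms=200):
    """P(xi >= k) / P(xi > k) for a Poisson(lam) variable, computed as 1 + p_k / sum_{s>k} p_s.

    The tail is accumulated relative to p_k, so no factorials or exponentials are needed.
    """
    term = 1.0
    tail = 0.0
    for j in range(1, terms + 1):
        term *= lam / (k + j)   # p_{k+j} / p_k
        tail += term
    return 1.0 + 1.0 / tail

if __name__ == "__main__":
    lam = 3.0
    for k in (3, 5, 10, 20):
        print(k, poisson_jump_ratio(k, lam), (k + 1) / lam)
```

The ratio grows without bound as $k$ increases, so it stays above $1 + \beta$ for a fixed $\beta > 0$ along $x_k = k$, which is what (70) requires.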


⁴ The number $x_0$ has the same meaning as in Theorem 7.




Le théoréme suivant donne un critére nécessaire et suffisant simple de con- 
vergence vers la loi A(x) pour un choix particulier des constantes a, . 

TutorrkM 8. Pour que, pour un certain choix de la constante positive a et des 
contantes reelles b,, , la fonction de distribution F(x) satisfasse a la relation 


(71) F"(ax + bn) A(z) 
pour n — o, il faut et il suffit que l'on ait 
1 — F(log x) 
(72) 1 — F(log kz) 


pour toute valeur constante k > 0, ow aa = 1. 

PROOF: If condition (71) holds we have, by virtue of Lemma 5, the inequality
F(x) < 1 for all values of x, and we see that $b_n \to \infty$ as $n \to \infty$.

Next, it is evident that determining the conditions under which
(71) holds is equivalent to determining the conditions under which

$$(73)\qquad \lim_{n\to\infty} F^n(x + b_n) = \Lambda(\alpha x).$$

Put

$$x = \log z, \qquad b_n = \log \beta_n, \qquad \bar F(z) = F(\log z).$$

It is evident that $\bar F(z)$ is a distribution function. Under these conditions
determining the conditions under which (73) holds reduces to the same
question for

$$(74)\qquad \bar F^n(\beta_n z) = F^n(\log z + \log \beta_n) \to \Phi_\alpha(z).$$

The necessary and sufficient conditions for relation (74) to hold were
found in §3; they are equivalent to condition (72).
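
A small numerical illustration of the criterion may help; it is a sketch under stated assumptions, not part of the original argument. The standard exponential law is chosen here purely as an example (with a = 1, $b_n = \log n$, hence $\alpha = 1$), and the function names are hypothetical.

```python
import math

def F(x):
    """Standard exponential distribution function (illustrative choice)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def Lambda(x):
    """Limit law Lambda(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def ratio_72(x, k):
    """Left side of (72): (1 - F(log x)) / (1 - F(log kx))."""
    return (1.0 - F(math.log(x))) / (1.0 - F(math.log(k * x)))

def power_71(n, x):
    """F^n(a x + b_n) with the assumed constants a = 1 and b_n = log n."""
    return F(x + math.log(n)) ** n

# For this law the ratio in (72) equals k = k^alpha with alpha = 1,
# and F^n(x + log n) is already close to Lambda(x) for large n.
print(ratio_72(1e3, 2.5))
print(power_71(10**6, 0.3), Lambda(0.3))
```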

A criterion convenient in applications for deciding whether a law F(x), satisfying
the condition F(x) < 1 for every value of x, belongs to the domain of attraction
of the law $\Lambda(x)$, was stated by Misès in his cited work. The condition of
Misès is the following:

Let F(x) be a function admitting, for all x greater than a certain
value $x_0$, derivatives of the first two orders. Put

$$f(x) = \frac{F'(x)}{1 - F(x)};$$

then, if

$$\lim_{x\to\infty} \frac{d}{dx}\left[\frac{1}{f(x)}\right] = 0,$$

one has

$$\lim_{n\to\infty} F^n\!\left(x_n + \frac{x}{n F'(x_n)}\right) = \Lambda(x),$$

where $x_n$ is the smallest root of the equation 1 − F(x) = 1/n. An analogous
proposition can be proved for a function not satisfying the
condition F(x) < 1 for all x. If there exists an $x_0$ such that, for every $\varepsilon > 0$,

$$F(x_0 - \varepsilon) < 1, \qquad F(x_0) = 1,$$

and if, from some $x < x_0$ on, the function F(x) admits first and
second derivatives and

$$\lim_{x\to x_0 - 0} \frac{d}{dx}\left[\frac{1}{f(x)}\right] = 0,$$

then F(x) belongs to the domain of attraction of the law $\Lambda(x)$.


MATHEMATICAL INSTITUTE OF THE 
ACADEMY OF SCIENCES OF THE U.S. S. R. 


REFERENCES


1. M. Fréchet. Sur la loi de probabilité de l'écart maximum. Annales de la Société
Polonaise de Mathématique, t. VI, p. 93, Cracovie, 1927.

2. R. A. Fisher and L. H. C. Tippett. Limiting forms of the frequency distribution of the
largest or smallest member of a sample. Proceedings Cambridge Philos. Soc., V.
XXIV, part II, pp. 180-190, 1928.

3. R. de Misès. La distribution de la plus grande de n valeurs. Revue Mathématique de
l'Union Interbalkanique, t. I, f. 1, pp. 141-160, 1939.

4. A. Khintchine. Théorèmes limites pour les sommes des variables aléatoires indépendantes
(in Russian), Moscou 1938.

5. B. Gnedenko. To the theory of the domains of attraction of stable laws, Uchenye Zapiski
of the Moscow State University, t. 30, pp. 61-81 (1939).

6. B. de Finetti. Sulla legge di probabilità degli estremi, Metron v. IX, No. 3-9, pp.
127-138 (1932).




ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


TRANSFORMATION GROUPS OF SPHERES 


By DEANE MONTGOMERY¹ AND HANS SAMELSON
(Received December 7, 1942) 


I 


1. The compact Lie group G is said to be a transformation group of the space 
W or to act on the space W if the following conditions are satisfied: 

a) to every element g of G there is associated a homeomorphism g(x) [x in W]

of W onto itself.

b) if $g_1$ and $g_2$ are elements of G then $g_1[g_2(x)] = (g_1 g_2)(x)$.

c) the point g(x) depends continuously on the pair (g, x).

Conditions a) and b) imply that to the identity element of G is associated the 
identity homeomorphism. The group G is said to act transitively if in addition 
to a), b), and c) the following fourth condition is satisfied: 

d) for any two points x and y of W there is an element g in G such 

that g(x) = y. 
When d) is satisfied we say that W is a homogeneous space under G. 

In this paper we take for W the n-dimensional sphere $S^n$ and study the
question of what compact connected Lie groups can act transitively and
effectively (see 2 a) below) on $S^n$. In I we prove a theorem on the structure of
such a group which shows us that our main concern in the study of this problem
is with simple groups. In II we study the question for simple groups using the
Killing-Cartan classification, and we find that in general only those simple groups
can be transitive and effective on $S^n$ which are well known to be so. In III
we use our methods to draw some conclusions about the structure of certain
subgroups of the rotation group of the n-dimensional sphere, which we denote
by $R_n$. Otherwise expressed, $R_n$ is the group of orthogonal transformations of
determinant 1 on n + 1 real variables.
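
The transitivity condition d) for the rotation group acting on a sphere can be made concrete. The sketch below is an illustration only and assumes nothing beyond elementary linear algebra: for two unit vectors x and y it builds a rotation g with g(x) = y as the product of two hyperplane reflections (the first sends x to y, the second fixes y). All function names are hypothetical.

```python
import math

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def reflect(u, v):
    """Reflect v in the hyperplane orthogonal to the unit vector u."""
    d = sum(a * b for a, b in zip(u, v))
    return [b - 2.0 * d * a for a, b in zip(u, v)]

def unit_orthogonal_to(y):
    """Some unit vector orthogonal to the unit vector y (dimension >= 2)."""
    i = min(range(len(y)), key=lambda j: abs(y[j]))  # axis least aligned with y
    e = [1.0 if j == i else 0.0 for j in range(len(y))]
    return normalize([a - y[i] * b for a, b in zip(e, y)])

def rotation_taking(x, y):
    """Return a map g, a product of two reflections, with g(x) = y."""
    x, y = normalize(x), normalize(y)
    if max(abs(a - b) for a, b in zip(x, y)) < 1e-12:
        return lambda v: list(v)                    # the identity already works
    u1 = normalize([a - b for a, b in zip(x, y)])   # first reflection sends x to y
    u2 = unit_orthogonal_to(y)                      # second reflection fixes y
    return lambda v: reflect(u2, reflect(u1, v))

g = rotation_taking([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
print(g([1.0, 0.0, 0.0]))  # approximately the point (0, 0, 1)
```

Two reflections are used because a single reflection has determinant −1, whereas their product has determinant 1 and so belongs to the rotation group.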


2. We begin by noting some definitions and facts which will be of use in the 
course of the paper. All groups considered are compact Lie groups and we make 
the usual convention that finite groups are special cases of these. Subgroups 
are always taken as closed. For theorems on topological groups and Lie groups 
see [13] and for a general discussion of transformation groups see [18]. 

a) Let G act on W as above. Let H be any subgroup and let x be any point
of W. The set of points of the form h(x), h in H, is called the orbit of x under H
and is denoted by H(x). We see that H acts transitively on H(x). Similarly
if M is any subset of W then H(M) denotes the set of all points of the form
h(m), h in H and m in M. The elements g of G for which the transformation
g(x) is the identity transformation of W form a normal subgroup $G_0$. If $G'$
is an arbitrary normal subgroup of G contained in $G_0$ then the factor group


1 Guggenheim Fellow. 

$G/G'$ acts in a natural way on W and G and $G/G'$ generate the same orbits. The
group G is said to act effectively if $G_0$ contains only the element e. We notice
that $G/G_0$ always acts effectively.

b) Let H be any subgroup of G. The set of left cosets of H is made into a 
space (decomposition space) by means of the natural topology as follows: a 
set of cosets is an open set, if the points of G belonging to these cosets form an 
open set in G. This space is called the coset space and is denoted by G/H 
(in analogy with factor group); it is a manifold the dimension of which is the 
difference of the dimensions of G and H. The group G acts and acts transitively
on G/H if to the element g we let correspond the transformation of G/H which 
sends the arbitrary coset bH into the coset gbH. The mapping which sends the 
element g of G into the element gH of G/H is called the projection of G onto G/H. 

It is well known that homogeneous space and coset space are equivalent con- 
cepts, as we now indicate. Let W be a homogeneous space under G. Choosing 
arbitrarily a point x of W let H be the subgroup of those elements h in G which 
have x as fixed point, h(x) = x. For each y in W consider the set of those g’s 
for which g(x) = y; this set is a left coset of H, and this correspondence between 
the points of W and G/H is a homeomorphism. The transformation g(x) may 
therefore be considered as a transformation of G/H; it coincides there with the 
above introduced mapping bH — gbH. We call H the associated subgroup. 
We shall frequently denote the group H here defined by the symbol $G_x$.

If we choose a different point x' to start with we simply have to perform an
inner automorphism of G in order to find the associated subgroup. This is
because the subgroup leaving x' fixed is $H' = g'Hg'^{-1}$, where g' is any element
with the property g'(x) = x'.

The group $G_0$ (cf. a)) of elements which induce the identity transformation of
W includes all normal subgroups of G contained in H; it is in fact the inter- 
section of all groups conjugate to H. Therefore G is effective if and only if H 
contains no normal subgroup of G different from e. 

c) The coset decomposition of G with respect to H is a special case of a fiber 
decomposition [4, 16]. We call a manifold M fibered with the fiber (-manifold) 
F, if the following holds: M is decomposed into sets homeomorphic with F, the 
“fibers’’; every point of M is contained in one and only one fiber; and every 
fiber has a neighborhood in M which is homeomorphic to the topological product 
of F and a cell C, in such a way that a fiber is carried to a set which is the product 
of F and a point. The dimension of C is the difference of the dimensions of M 
and F. The decomposition space, which we get by considering the fibers as 
points, is a manifold (of the dimension of C). 

d) The rank r(G) of a compact connected Lie group (in this connection see
[8]) is the dimension of a maximal Abelian subgroup of G, that is of an Abelian
subgroup not contained in a larger Abelian subgroup (if G is not connected, we
simply consider the component of the identity of G). Such a group is always a
toral group, that is the direct product of a certain number of copies of the
rotation group $R_1$ of the 1-sphere $S^1$. All maximal toral subgroups are
conjugate, according to a fundamental theorem of Cartan. Each element of G lies
on at least one maximal toral subgroup. The rank of a Lie group of positive
dimension is always positive; we use in this paper "group of rank 0" as
equivalent with "finite group." We note that if H is a normal subgroup of G and G/H
the corresponding factor group, then:


$$r(G) = r(H) + r(G/H).$$


In particular, if G is the direct product of $G_1$ and $G_2$ (we denote this by
$G = G_1 \times G_2$), then every maximal toral subgroup of G is of the form $T_1 \times T_2$,
where $T_i$ is a maximal toral subgroup of $G_i$ (i = 1, 2).


e) Let $G_1, \dots, G_k$ be (compact connected) Lie groups and let N be a finite
normal subgroup of the direct product $G = G_1 \times G_2 \times \cdots \times G_k$. We say that
the factor group $G^* = G/N$ is essentially the product of $G_1, \dots, G_k$. It is
known that every compact connected Lie group is essentially the product of
some simply connected simple groups and a toral group (see [13]).


3. We prove now a theorem on the structure of a group acting transitively on
a sphere $S^n$. Let $R_1$ be the rotation group of the 1-sphere, and $\tilde R_2$ the simply
connected covering group of the rotation group $R_2$ of the 2-sphere. The group
$\tilde R_2$ may also, of course, be characterized as the group of quaternions of absolute
value 1.

THEOREM I. Let the compact connected Lie group G act transitively and
effectively on $S^n$.

a) if n is even, then G is simple

b) if n is odd, then G is either simple or essentially the product of two simple
groups $G_1$, $G_2$, where $G_2$ is either $R_1$ or $\tilde R_2$; and the subgroup of G corresponding
to $G_1$ is transitive on $S^n$.

In the course of the argument we shall also prove the following theorem:

THEOREM I'. Let $G_1$ and $G_2$ be two compact connected Lie groups and let
$G = (G_1 \times G_2)/N$ where N is a finite normal subgroup of $G_1 \times G_2$. If G is transitive
on $S^n$ then one of the two subgroups of G corresponding to $G_1$ and $G_2$ is transitive
on $S^n$.

In proving these two theorems we consider $G = (G_1 \times G_2)/N$ as given in the
hypothesis of Theorem I'. We let $\bar G = G_1 \times G_2$ and we let $\bar G$ act in the natural
way on $S^n$. We note that if G is effective then $\bar G$ is almost effective in the sense
that only a finite number of its elements (in fact the elements of N) are the
identity transformation of $S^n$. Let H be the associated subgroup of $\bar G$ which
leaves fixed an arbitrary but definitely chosen point x of $S^n$. We shall find it
convenient to identify the coset space $\bar G/H$ and $S^n$ (see 2 b)).

Theorem I is trivial for the case n = 1 as it is known that the only compact
connected Lie group which can be effective on $S^1$ is $R_1$. Theorem I' follows
easily too because of the fact that for any x in $S^1$, $G_1(x)$ and $G_2(x)$ are sets which
are either manifolds or contain only a single point. If both these sets contained
only a single point, namely x, then G(x) would also contain only the point x.
Hence either $G_1(x)$ or $G_2(x)$ is a manifold of positive dimension and must
therefore coincide with $S^1$.

In view of the above remarks we assume from now on that n > 1. The
associated subgroup H is then connected, because of the fact that in this case $S^n$
is simply connected [1]. We remark that we think of $G_1$ and $G_2$ as contained
in G; we notice that $g_1 g_2 = g_2 g_1$ ($g_i$ in $G_i$) and $G_1 \cap G_2 = e$.

We shall also have need of the following statements about the ranks of G 
and H: 

A) If n is even, then r(G) = r(H).

B) If n is odd, then r(G) = r(H) + 1.

These statements follow from the fact that the rank r(G) of an arbitrary compact
connected Lie group G is equal to a certain homology invariant l(G) which we
do not define here [6] and from the fact that the statements A) and B) with r(G)
replaced by l(G) are known to be true [14].
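
Statements A) and B) can be checked on the most familiar example. The sketch below is an illustration, not a proof: it takes $G = R_n$ acting on $S^n$ with associated subgroup $H = R_{n-1}$ (an assumption made for the example) and uses the rank formula for $R_n$ quoted later in the paper, namely n/2 for even n and (n + 1)/2 for odd n. The function names are hypothetical.

```python
def rank_R(n):
    """Rank of R_n, the rotation group of S^n: n/2 for even n, (n+1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (n + 1) // 2

def rank_gap(n):
    """r(G) - r(H) for G = R_n acting on S^n with associated subgroup H = R_(n-1)."""
    return rank_R(n) - rank_R(n - 1)

for n in range(2, 9):
    # Expected pattern: gap 0 for even n (statement A), gap 1 for odd n (statement B).
    print(n, rank_gap(n))
```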

Proceeding now with the proof of Theorem I we first consider the case where
n is even and to begin with we do not assume G to be effective. Let $h = h_1 h_2$
($h_i$ in $G_i$) be any element of H. It is contained in a maximal toral subgroup T
of H (cf. 2 d)). Because of 3 A) T is also a maximal toral subgroup of G and
so is of the form $T_1 \times T_2$ where $T_i$ is a maximal toral subgroup of $G_i$ (cf. 2 d)).
Therefore the factors $h_1$ and $h_2$ of h are in T, and so in H. This means that H
splits into a product $H_1 \times H_2$, where $H_i$ is the intersection $H \cap G_i$ of H and
$G_i$ (i = 1, 2). But then the coset space G/H clearly decomposes into the
topological product of the coset spaces $G_1/H_1$ and $G_2/H_2$:


$$G/H = G_1/H_1 \times G_2/H_2$$


(= means "homeomorphic to"). On the other hand we have $G/H = S^n$.
Now if a sphere $S^n$ is represented as the topological product of two manifolds,
it follows from theorems on the homology of topological products that one of
the two manifolds must be a point. We may suppose that $G_2/H_2$ is a point.
This means that $G_2 = H_2$; a fortiori we have $G_2 \subset H$. Now $G_2$ being a normal
subgroup of G we find according to 2 b) that the elements of $G_2$ induce the
identity transformation of $S^n$. But then $G_1 = G/G_2$ must be transitive on $S^n$
(cf. 2 a)); this proves Theorem I' for even n.

Let us suppose now that G is effective. If G were not simple then
$G = (G_1 \times G_2)/N$ as before. Then, as noted before, $\bar G = G_1 \times G_2$ is "almost" effective, and
therefore H contains no infinite normal subgroup of $\bar G$. But $G_2$, being of positive
dimension, is infinite. This contradiction shows that G must be simple, and
Theorem I a) is proved.


4. Now we let n be odd and for the present we do not require that G be
effective. Again we have $G/H = S^n$, and $G = G_1 \times G_2$. We are unable to
prove that H decomposes into the direct product of its intersections with $G_1$ and
$G_2$. Therefore we consider the smallest subgroup $\Gamma$ of G which contains H
and which decomposes into a direct product: $\Gamma = \Gamma_1 \times \Gamma_2$, where $\Gamma_i$ is a
subgroup of $G_i$ (i = 1, 2). Obviously $\Gamma_i$ is the image of H under the natural
homomorphic mapping $g_1 g_2 \to g_i$ of G onto $G_i$ (i = 1, 2; $g_i$ in $G_i$); it is clear from
this that $\Gamma_1$ and $\Gamma_2$ are connected. Let $H_i$ be the intersection $G_i \cap H$; we note
that $H_i$ is a normal subgroup of $\Gamma_i$ (i = 1, 2).

We show now that the coset spaces $\Gamma_1/H_1$, $\Gamma_2/H_2$, $\Gamma/H$ are homeomorphic.
The easiest way to see this is by considering orbits. Let x be, as before, the
point of $S^n$ which corresponds to the associated subgroup H. Clearly we have
$\Gamma_i/H_i = \Gamma_i(x)$; for $\Gamma_i$ acts transitively on $\Gamma_i(x)$, and the associated subgroup,
if we choose the point x for its determination, is the intersection of $\Gamma_i$ and H,
and this is $H_i$. Similarly we have $\Gamma/H = \Gamma(x)$. Now $\Gamma_1$ and $\Gamma_2$ have the
property that for each $g_1$ in $\Gamma_1$ there is a $g_2$ in $\Gamma_2$ such that $g_1(x) = g_2^{-1}(x)$, and the
same with the indices interchanged; this follows immediately from the definition
of $\Gamma_i$, because the condition $g_1(x) = g_2^{-1}(x)$ is equivalent with the condition
$g_1 g_2$ in H. This shows that $\Gamma_1(x) = \Gamma_2(x)$ and therefore, because $\Gamma(x) =
\Gamma_1(\Gamma_2(x))$, both orbits are equal to $\Gamma(x)$. Moreover it is clear that this set
is the intersection $G_1(x) \cap G_2(x)$; we denote it by F.


5. We determine now the structure of F. Consider the coset space $\Gamma_1/H_1$,
homeomorphic with it. Since $H_1$ is a normal subgroup of $\Gamma_1$, it is a (compact
connected Lie) group.

Suppose it is of rank 0 (cf. 2 d)). Being connected, it contains then only one
element; this means $\Gamma_1 = H_1$, and therefore also $\Gamma_2 = H_2$. It follows that
$H = H_1 \times H_2$. Hence the same argument as in 4 applies, and therefore Theorem
I' follows in this case, that is under the assumption that the rank of $\Gamma_1/H_1$ is 0.

Suppose now that the rank of $\Gamma_1/H_1$ is positive. We show that in this case
the rank is 1.

Let r, $r_1$, $r_2$ be the ranks of G, $G_1$, $G_2$; we have $r = r_1 + r_2$ (cf. 2 d)). The
rank of H is r − 1 (cf. 3 B)). Let T be a maximal toral subgroup of H; its
dimension is r − 1. It is contained in a maximal toral subgroup $T'$ of G; and
this $T'$ is of the form $T_1 \times T_2$, where $T_i$ is a maximal toral subgroup of $G_i$ and
has dimension $r_i$ (cf. nr. 2 d)). Now it is clear that the intersection $T \cap T_1$ is
of dimension at least $r_1 - 1$. That means that the rank of $H_1 = H \cap G_1$ is at
least $r_1 - 1$. On the other hand, $\Gamma_1$ being a subgroup of $G_1$, the rank of $\Gamma_1$
is at most $r_1$. Since $H_1$ is a normal subgroup of $\Gamma_1$, we have the equation
$r(\Gamma_1) = r(H_1) + r(\Gamma_1/H_1)$. It follows that $r(\Gamma_1/H_1)$ is at most 1, and so, being
positive, it is equal to 1. Because of the fact that $\Gamma_1/H_1$ is of rank 1, it is
homeomorphic with one of the three following manifolds: the 1-sphere $S^1$, the 3-sphere
$S^3$, the projective 3-space $P^3$ [15]; the same holds of course for F.


6. We have $G \supset \Gamma = \Gamma_1 \times \Gamma_2 \supset H$, and the coset space $\Gamma/H$ is homeomorphic
with F (nr. 5). The decomposition of G into cosets of $\Gamma$ induces, because of
$\Gamma \supset H$, a decomposition of G/H into sets homeomorphic with $\Gamma/H$; one verifies
easily, because of the analytic nature of all imbeddings involved, that this is a
fiber decomposition of G/H with the fiber $\Gamma/H$ (cf. nr. 2 c)). In symbols we
have

$$G/\Gamma = (G/H)/(\Gamma/H).$$

(This corresponds to a well known theorem in group theory, for the case of
normal subgroups.) The right hand side of this formula means that $S^n$ (= G/H)
is fibered, and the fiber is F (= $\Gamma/H$). In the left hand side we introduce the
expressions $G_1 \times G_2$, $\Gamma_1 \times \Gamma_2$ for G, $\Gamma$. As in nr. 4 we have obviously
$G/\Gamma = G_1/\Gamma_1 \times G_2/\Gamma_2$. Finally we get:


$$S^n/F = G_1/\Gamma_1 \times G_2/\Gamma_2;$$


the fiber space $S^n/F$ is homeomorphic with the topological product of $G_1/\Gamma_1$
and $G_2/\Gamma_2$.

We know now that F is a homology-sphere (manifold with the Betti numbers
of a sphere), of dimension d equal 1 or 3. Thus we are in a position to apply
the theorems of Gysin [5], which give the result that the homology ring of
$S^n/F$ (with rational coefficients) has the following structure: it has a unit 1,
corresponding to the fundamental cycle of the manifold; it has a certain element
$\xi$ of dimension d + 1 lower than that of 1; a certain power $\xi^m$ is 0-dimensional
and different from 0; the elements 1, $\xi$, $\xi^2$, ..., $\xi^m$ form a complete homology
basis. It is now easy to see that this ring cannot occur as the homology ring of
the topological product of two manifolds of positive dimensions. Therefore
either $G_1/\Gamma_1$ or $G_2/\Gamma_2$ must reduce to a point; suppose this holds for $G_2/\Gamma_2$.
Then we have $G_2 = \Gamma_2$.


7. From the fact just demonstrated it follows that $G_1$ acts transitively on $S^n$
as we shall now show. The orbit $G_2(x)$ is equal to F, because $F = \Gamma_2(x)$ and
$G_2 = \Gamma_2$. Consequently we have $G_2(x) \subset G_1(x)$, because $F = G_1(x) \cap G_2(x)$
(nr. 5). Therefore $G_1(G_2(x)) = G_1(x)$; but $G_1(G_2(x))$ is G(x), and G being
transitive on $S^n$, this is all of $S^n$. Thus we have $G_1(x) = S^n$, and Theorem I' is now
proved completely.

Suppose now that G is effective. By definition $H_2$ is such that $H_2(x) = x$.
If y is any other point of $S^n$ there is an element $g_1$ in $G_1$ with the property that
$y = g_1(x)$. Hence $g_1 H_2 g_1^{-1}(y) = y$. In view of the fact that elements of $G_1$
commute with those of $G_2$ we have $g_1 H_2 g_1^{-1} = H_2$. It has thus been shown that
$H_2$ is a subgroup of $G_0$ (nr. 2 b)), that is $H_2$ leaves every element of $S^n$ fixed. The
group $\bar G = G_1 \times G_2$ is "almost" effective and consequently $H_2$ must be a finite group. But
$\Gamma_2/H_2 = G_2/H_2$ is homeomorphic with F, and this means that $G_2$ itself is
homeomorphic with a group of rank 1, that is with $R_1$ or $R_2$ or $\tilde R_2$. It is easy
to see that it must then be isomorphic with one of these.

So far we have shown that, given any representation of G as essentially the
product of two groups $G_1$ and $G_2$, then $G_1$ must be transitive over $S^n$ and $G_2$
is isomorphic with $R_1$ or $R_2$ or $\tilde R_2$ (indices chosen properly). This clearly proves
Theorem I b), if we show moreover that $G_1$ is simple.



Suppose it is not simple; let $G_1 = G' \times G''$ (possibly replacing $G_1$ by a finite
covering group). We use now for $G_1$ the same argument we used for G and find
that $G'$ is transitive over $S^n$ and that $G''$ is homeomorphic with $R_1$ or $R_2$ or $\tilde R_2$.
We write now G as $G' \times (G'' \times G_2)$. The second factor $G'' \times G_2$ is not
isomorphic with $R_1$ or $R_2$ or $\tilde R_2$; therefore, by what we have proved so far, it must
be transitive on $S^n$. But this is impossible for n > 3, because, again by the
results obtained so far, one of the factors $G''$, $G_2$ would have to act transitively,
which is obviously impossible. The cases n = 2, 3 are treated easily for
themselves. Theorem I is now proved completely.

It is worth pointing out that the possibilities mentioned in Theorem I may
actually occur. There are groups which are essentially products of the type
$G_1 \times R_1$ and $G_1 \times \tilde R_2$, where $G_1$ is simple, which act transitively and effectively
on odd dimensional spheres. The group $A_n$ (see for instance [14]) is defined
as the group of all unitary matrices of determinant unity on n + 1 complex
variables. The group, call it $A'_n$, of all unitary matrices on n + 1 complex
variables is essentially the product of $A_n$ and $R_1$. It is clear that $A'_n$ is transitive
on $S^{2n+1}$ and it is also effective. Thus for every odd dimensional sphere $S^{2n+1}$
of dimension greater than one there is a group of the type $G_1 \times R_1$ which is
transitive on it.

A group of the type $G_1 \times R_2$ could not be effective on $S^{2n+1}$ for n = 1, 2, ....
For if it were, $S^{2n+1}$ would be fibered by sets homeomorphic to S’ or else by sets
homeomorphic to $P^3$. These fibers are the orbits of the second factor. Either

case is impossible as has been shown by Gysin [5]. 

However there is a group of the type of $G_1 \times \tilde R_2$ which is effective and
transitive on $S^{4n-1}$, n = 1, 2, .... Without giving details we shall merely mention
that it may be obtained from $C_n$ [see 14] in analogy with the way $A'_n$ is obtained
from $A_n$ if we represent $C_n$ by means of linear transformations on sets of n
quaternions [19].


II 


8. We now consider simple groups transitive on the sphere $S^n$. According
to the Killing-Cartan classification every compact connected simple Lie group
is locally isomorphic to one of the following simple groups: $R_n$, the rotation group
of the n-sphere $S^n$ (for n ≠ 1, 3); $A_n$, the unitary unimodular group on n + 1
complex variables; $C_n$, the symplectic group on 2n complex variables; and
five exceptional groups of dimensions 14, 52, 78, 133, 248 and ranks 2, 4, 6, 7, 8.

The dimension of $R_n$ is n(n + 1)/2, that of $A_n$ is n(n + 2), and that of $C_n$
is n(2n + 1). The rank of $R_n$ is n/2 for even n and (n + 1)/2 for odd n; the
rank of $A_n$ is n and the rank of $C_n$ is n. The group $R_n$ is transitive on $S^n$ and
$A_n$ and $C_n$ act in a natural way (as subgroups of the respective rotation groups)
transitively on $S^{2n+1}$ and $S^{4n-1}$ respectively [14 p. 1126 ff.]. We shall speak of
$R_n$, $A_n$, and $C_n$ or any connected group locally isomorphic to them as the
classical compact connected groups.
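
The dimension formulas can be cross-checked against the spheres on which the groups act. The sketch below is an illustration under an assumption not stated in this paragraph: that the associated subgroup of each natural action is the next smaller group of the same family ($R_{n-1}$, $A_{n-1}$, $C_{n-1}$), so that the difference of dimensions equals the dimension of the sphere. The function names are hypothetical.

```python
def dim_R(n):
    """Dimension of R_n, the rotation group of S^n."""
    return n * (n + 1) // 2

def dim_A(n):
    """Dimension of A_n, the unitary unimodular group on n + 1 complex variables."""
    return n * (n + 2)

def dim_C(n):
    """Dimension of C_n, the symplectic group on 2n complex variables."""
    return n * (2 * n + 1)

for n in range(2, 7):
    # dim G - dim H should equal the dimension of the sphere G acts on:
    # n for R_n, 2n + 1 for A_n, 4n - 1 for C_n.
    print(n, dim_R(n) - dim_R(n - 1), dim_A(n) - dim_A(n - 1), dim_C(n) - dim_C(n - 1))
```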

It will be necessary for us to use the homology properties of these groups and
we therefore list their homology rings. The symbol R(M) denotes the homology
ring with rational coefficients of the space M, the symbol = denotes isomorphism,
and × denotes as always the topological product. With these symbols, then,
the following results are known (Brauer, Pontrjagin):

THEOREM A

a) $R(A_n) = R(S^3 \times S^5 \times \cdots \times S^{2n+1})$

b) $R(C_n) = R(S^3 \times S^7 \times \cdots \times S^{4n-1})$

c) $R(R_{2n}) = R(S^3 \times S^7 \times \cdots \times S^{4n-1})$

d) $R(R_{2n-1}) = R(S^3 \times S^7 \times \cdots \times S^{4n-5} \times S^{2n-1})$.

It is also known that if $G_2$ is the exceptional simple group of dimension 14 then


$$R(G_2) = R(S^3 \times S^{11}).$$


The rings of the four other exceptional simple groups are not known. If two 
compact connected Lie groups are locally isomorphic they have the same 
homology ring. This can be seen for example as follows. If N is a finite normal 
divisor of G, then the mapping taking G to G/N takes each homology group of 
G onto the corresponding group of G/N. Hence no Betti number can be
increased by this mapping. On the other hand if r is the rank of G, the rank of
G/N is also r, and since the sum of the Betti numbers of both groups is therefore
$2^r$ [2] no Betti number can decrease. The Betti numbers of G and G/N are
therefore the same, and it follows that the homology rings are isomorphic. 
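
The statement that the Betti numbers of a group of rank r sum to $2^r$ can be checked against the product-of-spheres description of the classical groups. The sketch below assumes the standard lists of sphere dimensions (3, 5, ..., 2n+1 for $A_n$; 3, 7, ..., 4n−1 for $C_n$ and $R_{2n}$; 3, 7, ..., 4n−5 together with 2n−1 for $R_{2n-1}$); these lists and all function names are assumptions of the example. It also verifies that the sphere dimensions add up to the dimension of the group.

```python
def sphere_dims(family, n):
    """Assumed sphere dimensions: "A" -> A_n, "C" -> C_n,
    "R_even" -> R_{2n}, "R_odd" -> R_{2n-1}."""
    if family == "A":
        return list(range(3, 2 * n + 2, 2))                 # 3, 5, ..., 2n+1
    if family in ("C", "R_even"):
        return list(range(3, 4 * n, 4))                     # 3, 7, ..., 4n-1
    if family == "R_odd":
        return list(range(3, 4 * n - 4, 4)) + [2 * n - 1]   # 3, 7, ..., 4n-5, 2n-1
    raise ValueError(family)

def group_dim(family, n):
    """Dimension of the group, from the formulas n(n+2), n(2n+1), m(m+1)/2."""
    if family == "A":
        return n * (n + 2)
    if family == "C":
        return n * (2 * n + 1)
    m = 2 * n if family == "R_even" else 2 * n - 1
    return m * (m + 1) // 2

def betti_numbers(dims):
    """Coefficients of the product of (1 + t^d): Betti numbers of a product of spheres."""
    coeffs = [1]
    for d in dims:
        shifted = [0] * d + coeffs
        coeffs = [a + b for a, b in zip(coeffs + [0] * d, shifted)]
    return coeffs

for family, n in (("A", 4), ("C", 3), ("R_even", 3), ("R_odd", 3)):
    b = betti_numbers(sphere_dims(family, n))
    # Each family with parameter n has rank n, so the Betti numbers sum to 2**n.
    print(family, n, sum(b), len(b) - 1, group_dim(family, n))
```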

We shall use the following theorem on groups transitive on $S^n$ which connects
the homology properties of the group with those of the associated subgroup.
This theorem has been proved by Samelson [14].

THEOREM B. Let G be a compact connected Lie group which acts transitively on
$S^n$ and let H be the associated subgroup so that $G/H = S^n$.

a) if n is odd then $R(G) = R(H \times S^n)$ and H is not homologous to zero in G.

b) if n is even then $R(H) = R(\Pi \times S^{n-1})$ where $\Pi$ is a certain topological
product of spheres of odd dimension and $R(G) = R(\Pi \times S^{2n-1})$.


9. We have already seen that if n is even then any compact connected Lie
group which acts effectively on $S^n$ must be simple. We now examine in more
detail the simple groups which can act in this way.

Before we begin it will be convenient to prove a lemma which will be of use in
what follows. We recall [18, p. 202] that if G is effective on an n-dimensional
orbit then the dimension of G is ≤ n(n + 1)/2.

LEMMA 1. If a connected compact Lie group G of dimension n(n + 1)/2 is
transitive and effective on a simply connected n-dimensional manifold M then M in
an invariant metric is isometric to $S^n$ and G is continuously isomorphic to $R_n$.
Furthermore $G_x$ is isomorphic to $R_{n-1}$.

It is known [1] that we may introduce an invariant metric in M and [3] that
in this metric M is of constant curvature. Hence [7] in view of the fact that M
is compact and simply connected it must be isometric to the sphere $S^n$. The
isometry T taking M to $S^n$ carries G to a compact connected group $TGT^{-1}$ of
rotations of $S^n$ of dimension n(n + 1)/2. Therefore $TGT^{-1}$ contains all
rotations of $S^n$ and we see that all the statements of the lemma are true.

THEOREM II. If n is even, then except for a finite number of n's (the exceptional
values of n being ≤ 114) the only compact connected simple Lie group which can be
transitive on $S^n$ is locally isomorphic to $R_n$.

The main step of the proof is contained in a theorem which we now state.

THEOREM II'. If n is even the only compact classical group which can be
transitive over $S^n$ is locally isomorphic to $R_n$.

According to Theorem B b) the homology ring of G must be isomorphic to that
of a space containing $S^{2n-1}$ as a factor sphere. We know from Theorem A c)
d) that if a group $R_m$ has such a homology ring then m must be at least equal
to n. On the other hand the dimension of G can not be greater than the
dimension of $R_n$ for if it were G could not be effective. Hence if one of the groups
$R_m$ is transitive on $S^n$, then m equals n.

If a group $A_m$ has a homology ring of the kind we have observed G to have
then $A_m$ must have dimension greater than that of $R_n$ and hence no $A_m$ can be
transitive on $S^n$. By similar considerations we observe that the only group $C_m$
which could be transitive on $S^n$ is $C_{n/2}$. This group can not be transitive on
$S^n$ as we see from Lemma 1. This concludes the proof of Theorem II'.
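
The dimension comparison behind this argument can be tabulated. The sketch below is illustrative only and rests on assumptions not spelled out in the paragraph: that the factor spheres of $A_m$ have dimensions 3, 5, ..., 2m+1 and those of $C_m$ have dimensions 3, 7, ..., 4m−1, so the smallest m for which $S^{2n-1}$ occurs is n − 1 for $A_m$ and n/2 for $C_m$. The function names are hypothetical.

```python
def dim_R(n):
    return n * (n + 1) // 2      # dimension of R_n

def dim_A(m):
    return m * (m + 2)           # dimension of A_m

def dim_C(m):
    return m * (2 * m + 1)       # dimension of C_m

def smallest_A(n):
    """Least m with 2m + 1 >= 2n - 1, so that S^(2n-1) can be a factor sphere of A_m."""
    return n - 1

def smallest_C(n):
    """Least m with 4m - 1 >= 2n - 1 (n even), so that S^(2n-1) can be a factor of C_m."""
    return n // 2

for n in range(4, 14, 2):
    # A_m is always too large; C_(n/2) has exactly the dimension of R_n,
    # which is the case excluded by Lemma 1.
    print(n, dim_R(n), dim_A(smallest_A(n)), dim_C(smallest_C(n)))
```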

Theorem II in its general statement follows from Theorem II’. In order to 
obtain a limit on the dimensions of even dimensional spheres on which there 
are exceptions one must proceed to the direct consideration of the five exceptional 
simple groups. Here, although slightly sharper results can be derived, we are 
content with the one already stated. The dimension of the highest dimensional 
exceptional group $G_{248}$ is 248. We know that if this is transitive on an even
dimensional sphere $S^n$ then


$$R(G_{248}) = R(S^{n_1} \times \cdots \times S^{n_7} \times S^{2n-1})$$


where $n_i$ (i = 1, ..., 7) is an odd integer. Because it is known [2] that the one
dimensional Betti number of a simple group vanishes, $n_i$ is greater than or
equal to 3. Hence


$$21 + 2n - 1 \le 248,$$
$$n \le 114.$$
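
The arithmetic of this bound is easy to reproduce. The following sketch is only an illustration (the function name is hypothetical): a group of rank r and dimension D whose ring is that of a product of r odd spheres, one of them $S^{2n-1}$ and the others of dimension at least 3, must satisfy 3(r − 1) + 2n − 1 ≤ D. The dimensions and ranks of the five exceptional groups are those listed in section 8.

```python
def max_even_sphere_dim(group_dim, rank, min_factor=3):
    """Largest n with (rank - 1) * min_factor + (2n - 1) <= group_dim."""
    return (group_dim - (rank - 1) * min_factor + 1) // 2

# (dimension, rank) of the five exceptional groups, as listed in section 8.
exceptional = [(14, 2), (52, 4), (78, 6), (133, 7), (248, 8)]

for dim, rank in exceptional:
    print(dim, rank, max_even_sphere_dim(dim, rank))

print(max_even_sphere_dim(248, 8))  # 114
```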




10. We next consider the odd values of n and we consider separately the two
possibilities n ≡ 1 (4), and n ≡ 3 (4).

THEOREM III. Let n ≡ 1 (4). The only classical compact Lie groups which
act transitively on $S^n$ are locally isomorphic to $R_n$ and $A_{(n-1)/2}$.

It follows as before that with a possible finite number of exceptions these are
the only simple Lie groups transitive on $S^n$ with n ≡ 1 (4).

The proof of Theorem III is also based on the properties of homology rings.

According to Theorem B a), G must "contain" $S^n$ as a factor sphere in its
homology ring. Therefore it is clear at once that G cannot be $C_m$ for any m or
$R_m$ for any even m; and that the only possible $R_m$ with m odd is $R_n$. Moreover
it follows that if G is a group $A_m$, then $m \ge (n-1)/2$. Suppose $m > (n-1)/2$; then
the ring of the associated subgroup H would be, according to Theorem B a),
isomorphic with $R(S^3 \times S^5 \times \cdots \times S^{n-2} \times S^{n+2} \times \cdots \times S^{2m+1})$. This leaves
for H only few possibilities: H could equal $R_4$ or $C_2$ or one of the exceptional
groups F',E.i;Es, because no other group has a homology ring of this structure.
If we take n sufficiently high, these possibilities are excluded and Theorem III
is proved.


11. Next we wish to prove:

THEOREM IV. Let n ≡ 3 (4); except for a finite number of n's the only simple
groups transitive on $S^n$ are $R_n$, $A_{(n-1)/2}$, $C_{(n+1)/4}$.

The proof of this is considerably
more difficult than the proofs of the two preceding theorems.

Before beginning it is convenient to prove several lemmas which will be of 
use later. A stationary point of a group (or subset of a group) is a point (of the 
space on which the elements act) left fixed by every element of the group (or 
subset). 

LEMMA 2. Let M be a subset of the group of rotations $R_n$ acting as it ordinarily
does on $S^n$. Then S(M), the set of stationary points of M, is a geometric sphere
of dimension −1, 0, 1, ....

Let $\bar S(M)$ be the stationary points of M when we consider M as acting in
$E_{n+1}$. Then $\bar S(M)$ is a linear subspace of $E_{n+1}$. But


$$S(M) = \bar S(M) \cap S^n$$


and the conclusion follows.
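
A concrete instance of Lemma 2, given as a sketch with hypothetical names: take for M a few rotations of $S^2$ about the third coordinate axis of $E_3$. The points left fixed in $E_3$ form that axis, a linear subspace, and its intersection with $S^2$ consists of the two poles, a geometric sphere of dimension 0.

```python
import math

def rotate_about_third_axis(theta, p):
    """Rotate the point p of E_3 by the angle theta about the third axis."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def is_stationary(p, angles, tol=1e-12):
    """True if p is left fixed by every rotation in the chosen subset M."""
    return all(
        max(abs(a - b) for a, b in zip(rotate_about_third_axis(t, p), p)) < tol
        for t in angles
    )

M = [0.3, 1.1, 2.0]  # angles of the rotations forming the subset M (arbitrary choice)

print(is_stationary((0.0, 0.0, 1.0), M))   # north pole: fixed
print(is_stationary((0.0, 0.0, -1.0), M))  # south pole: fixed
print(is_stationary((1.0, 0.0, 0.0), M))   # a point of the equator: moved
```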

Lemma 3. If $H$ is a connected closed subgroup of $R_n$ of dimension $n(n-1)/2$
then $H$ is continuously isomorphic to $R_{n-1}$ or $\tilde R_{n-1}$, the simply connected covering
group of $R_{n-1}$.

Let $R_n/H = M_n$, and let $x$ be a point of $M_n$ left fixed by $H$. The transformations
of $H$ act linearly on the space of tangent vectors to $M_n$ at $x$. Hence $H$ is a
group of linear transformations of $E_n$ and since $H$ is compact we may assume
that these linear transformations are orthogonal. We know that $R_n$ has either
no normal subgroups or at most a normal subgroup containing two elements.
Therefore either $R_n$ or $R_n/Z$, where $Z$ contains two elements, is effective. If a
group is effective, a subgroup is also. Hence either $H$ is effective or $H/Z$ is
effective, where $Z$ contains two elements. Hence, in view of the dimension of
$H$, either $H$ or $H/Z$ is $R_{n-1}$ and hence $H$ is either $R_{n-1}$ or $\tilde R_{n-1}$.

Lemma 4. The group $R_n$ contains no proper subgroup whose dimension is
greater than $\dim R_{n-1}$.

Since $R_n$ is simple, the group $R_n$ has at most a finite set of elements leaving
fixed all points of a coset space $R_n/H$. Hence $\dim R_n/H \ge n$ and consequently
$\dim H \le n(n+1)/2 - n = n(n-1)/2$.
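
The dimension count behind Lemma 4 can be written out as follows. This is a sketch under the assumption that $R_n$ is the rotation group of $E_{n+1}$, so that $\dim R_n = n(n+1)/2$; that value is consistent with the formula used in the proof above.

```latex
% Assumption: \dim R_n = \binom{n+1}{2} = n(n+1)/2.
\[
\dim H \;=\; \dim R_n - \dim R_n/H \;\le\; \frac{n(n+1)}{2} - n
       \;=\; \frac{n(n-1)}{2} \;=\; \dim R_{n-1}.
\]
```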

The set of elements of $R_n$ which leave fixed the unit point on the first of the
$n + 1$ axes of $E_{n+1}$ is isomorphic to $R_{n-1}$. This is one of many possible imbeddings
of $R_{n-1}$ in $R_n$. The particular subgroup picked out in this way we
shall denote by $Q_{n-1}$. The subgroup which leaves invariant the first axis we
denote by $Q'_{n-1}$. We see that $Q_{n-1} = Q'^{*}_{n-1}$, where $Q'^{*}_{n-1}$ is the identity component
of $Q'_{n-1}$.

Lemma 5. If $H$ is a proper closed subgroup of $R_n$ which includes $Q_{n-1}$, then $H$
is either $Q_{n-1}$ or $Q'_{n-1}$.

Since $H$ can not have larger dimension than $Q_{n-1}$, $H$ must consist of a finite
number of cosets of $Q_{n-1}$. Consider the action of $H$ on $S^n = R_n/Q_{n-1}$, and let
$p$ denote the unit point on the first axis so that $Q_{n-1}$ leaves $p$ fixed. The group
$H$ is in the normalizer of $Q_{n-1}$, and hence each coset $hQ_{n-1}$ takes $p$ to a point
left fixed by $Q_{n-1}$. Hence $H$ can include at most two components and if there
is a second one $H$ must be $Q'_{n-1}$.

Lemma 6. Let $R_m$ be imbedded in any way in $R_n$. Assume for every $x$ of $S^n$,
that $R_{m,x}$ (the subgroup of $R_m$ leaving $x$ fixed) is conjugate in $R_m$ to one of the subgroups
$Q_{m-1}$ or $Q'_{m-1}$ of $R_m$. Then $m \le n/2$.

Let $F$ denote the set of $x$'s such that $R_{m,x}$ is conjugate to $Q'_{m-1}$ and let $O$ denote
the set of $x$'s such that $R_{m,x}$ is conjugate to $Q_{m-1}$. The set $F$ is closed, $O$ is open,
both $F$ and $O$ are invariant under $R_m$, and

$$F + O = S^n.$$

In order to see that $F$ is closed notice that if $x$ is in $F$, then since $R_{m,x}$ is conjugate
to $Q'_{m-1}$, it must be true that $Q'_{m-1}$ has a stationary point in $R_m(x)$. That is,
$F$ is equal to the totality of orbits of $R_m$ for points in $S(Q'_{m-1})$. We shall now
show that $F$ contains no points. Assuming that $F$ does contain points we notice
that $S(Q'_{m-1})$ must contain one and only one point on each orbit in $F$. That is,
$S(Q'_{m-1})$ is a cross section of $F$ and $F$ is the topological product of this cross
section by an orbit of $F$. This is because we are dealing with a family of orbits.
Each orbit of $F$ is homeomorphic to $m$-dimensional projective space $P^m$, so we
have shown that $F$ is homeomorphic to the topological product of $S(Q'_{m-1})$ and
$P^m$. In view of Lemma 2 it follows that $S(Q'_{m-1})$ is a sphere of some dimension.

We now define a homeomorphism $T$ of period 2 taking $S^n$ into itself whose set
of fixed points is exactly $F$. We do this simply by leaving points of $F$ fixed and
by interchanging "diametral" points on orbits in $O$, the concept of "diametral"
points being defined on these orbits as a pair of points left fixed by precisely the
same subgroup of $R_m$. We see that $T$ is a homeomorphism as follows. Let
$x$ be any point of $S^n$. If $y$ is near $x$ then $R_{m,y}$ is "near" $R_{m,x}$. This implies that
the set of stationary points of $R_{m,y}$ is near the set of stationary points of $R_{m,x}$.
The existence of a $T$ with these properties shows that $F$ has the same homology
groups mod 2 as a sphere of some dimension [17]. This is inconsistent with the
above representation of $F$ as a topological product of two manifolds unless $F$ is
vacuous. Hence we have shown that $F$ is vacuous.

But then $S^n$ is fibered (because of the analyticity of all the imbeddings involved)
by sets homeomorphic to $S^m$, and so $m \le n/2$ [5].




Lemma 7. Let $H$ be a connected group of dimension $n(n-1)/2$ imbedded in
$R_n$. Then if $n \ne 3$, $n \ne 7$, $H$ is conjugate to $Q_{n-1}$.

We know in any case by Lemma 3 that $H$ is isomorphic to either $R_{n-1}$ or $\tilde R_{n-1}$.
This means that we know the homology ring of $H$ completely. We shall consider
the action of $H$ on the coset space $R_n/R_{n-1} = S^n$, and we shall first show
that $H$ can not be transitive on $S^n$.

We first consider the case where n = 2m. In this case 


and from Theorem B we see that H can not be transitive on S*”. 
We consider next the case where n = 2m — 1. In this case 


R(H) = X S’X 


clearly if 2m — 1 is of the form 5, 9, 13, --- H can not be transitive. We there- 
fore examine only the case where 2m — 1 is of the form, 3, 7, --- , that is 2m — 1 
= 4k — 1, so that 


R(A) = R(S* X X 
Hence if H/U = Sq_1 
R(U) = (SX S** x 


Now U is simple because only one S* appears on the right and it clearly can not 
be locally isomorphic to a classical group. Furthermore it can be checked, 
merely by enumerating possibilities that U can not be locally isomorphic to the 
last four exceptional groups. This leaves only one possibility, namely that 
H is Re 


R(H) = R(S* x 
R(U) = R(S* X S") 


and H/U = 8". 

We have now established that $H$ is not transitive on $S^n$. Because of dimensional
considerations it must have an $(n-1)$-dimensional orbit. This is because,
if $H$ did not have an orbit of dimension at least $n - 1$, it could
not be effective and at the same time have dimension $n(n-1)/2$. In this
connection see [18]. Hence all orbits are $(n-1)$-dimensional, except for two
orbits of lower dimension (see [11] and [18] and also the following lemma).
However the only orbit of $R_{n-1}$ or $\tilde R_{n-1}$ of lower dimension than $n - 1$ is a point.
Let this point be $z$. Then $H$ is in $R_{n,z}$. But $R_{n,z}$ is conjugate to $Q_{n-1}$. This
means that $H$ is conjugate to a subgroup of $Q_{n-1}$, which can only happen if $H$ is
conjugate to $Q_{n-1}$.

In connection with the following lemma we refer to [11] where a discussion
of an analogous question is carried out.

Lemma 8. Let $G$ be a subgroup of $R_n$ which in its action on $R_n/R_{n-1} = S^n$ has
an $(n-1)$-dimensional orbit. Then all orbits of $G$ are $(n-1)$-dimensional
except for two orbits of lower dimension. If one of these is not a point it carries a
cycle linked with some cycle carried by the other.

It is known that all orbits with two exceptions are $(n-1)$-dimensional and that
the decomposition space of the orbits of $G$ is an arc $ab$ where the end points $a$
and $b$ correspond to orbits of lower dimension. The identity components of
the groups $G_x$ and $G_y$ are conjugate if $x$ and $y$ are in $(n-1)$-dimensional orbits
[10, 11] and from this it can be seen that $G_x$ and $G_y$ are conjugate under the
same circumstances [11]. This fact gives us a way of deforming sets from one
$(n-1)$-dimensional orbit to another, and in general a way of deforming a figure
as long as we stay in the open set of $(n-1)$-dimensional orbits [11]. This deformation
can in fact be carried up to either of the exceptional orbits. These
considerations together with the Alexander duality theorem prove the lemma.

We are now ready to proceed with the proof of Theorem IV. 

For convenience put $m = (n + 1)/4$, so that $n = 4m - 1$. Leaving aside the
exceptional groups (which means that we choose $n$ high enough) we ask which
of the classical groups can be transitive over $S^n$. From arguments like those
above, namely by consideration of the homology ring of the associated subgroup,
it follows that for $n$ large enough, from the class $A$ only $A_{(n-1)/2}$, and from
the class $C$ only $C_m$ can be transitive. Similarly among the groups $R_k$ with odd $k$
only $R_n$ can be transitive. Finally the only $R_k$ with $k$ even which is not excluded
by this consideration of the homology properties is $R_{2m}$. Consequently
what we have to prove is that $R_{2m}$ can not act transitively on $S^{4m-1}$. Suppose
it does. The associated subgroup $H$, having $R(S^3 \times S^7 \times \cdots \times S^{4m-5})$ as its
homology ring according to Theorem B a), must be locally isomorphic with
either $R_{2m-2}$ or $C_{m-1}$.
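
A quick dimension check supports this dichotomy. It is a sketch resting on two standard values that the text does not state explicitly and which are therefore assumptions here: $\dim R_k = k(k+1)/2$ and $\dim C_k = k(2k+1)$.

```latex
% Assumptions: \dim R_k = k(k+1)/2 and \dim C_k = k(2k+1).
\[
\dim H = \dim R_{2m} - \dim S^{4m-1} = m(2m+1) - (4m-1) = 2m^2 - 3m + 1 = (2m-1)(m-1),
\]
\[
\dim R_{2m-2} = \frac{(2m-2)(2m-1)}{2} = (m-1)(2m-1), \qquad
\dim C_{m-1} = (m-1)\bigl(2(m-1)+1\bigr) = (m-1)(2m-1).
\]
```

Both candidates therefore have the dimension required of $H$; the homology ring is what rules out every other group.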

a) Suppose $H$ is locally isomorphic to $C_{m-1}$. We consider $R_{2m}$ acting in the
natural way on $S^{2m}$; then $H$, as a subgroup of $R_{2m}$, acts on $S^{2m}$ too. By Theorem
II' it can not be transitive. On the other hand some orbit must be at least
$(2m-2)$-dimensional. For otherwise, since $H$ is simple, all orbits would
have to be points, and $R_{2m}$ would not be effective. But Lemma 1 (Nr. 2) shows
that $H$, being locally isomorphic with $C_{m-1}$, can have no $(2m-2)$-dimensional
orbit. Therefore $H$ must have at least one orbit of dimension $2m - 1$.

As we have already observed it then follows that all orbits with the exception
of two are $(2m-1)$-dimensional, and that the two exceptional orbits have
lower dimensions. The orbit space, i.e. the decomposition space of the decomposition
of $S^{2m}$ into the orbits under $H$, is an arc, the end points of which
correspond to the exceptional orbits. Now these two orbits can not, as stated
above, be of dimension $2m - 2$. Therefore they must be of dimension 0; because
a group acting effectively on a space of dimension $k$ is of dimension at most
$k(k+1)/2$; and $C_{m-1}$, being simple, is effective or at least "almost" effective on
any orbit not a point, and is of dimension $(2m-2)(2m-1)/2$. This means
that $H$ has a stationary point, and so is contained in the associated subgroup
of $R_{2m}$, that is in $R_{2m-1}$. But this is impossible, as we shall now see by considering
the coset space $R_{2m-1}/H$. By considering $H$ as acting on the line elements
in the point of the space left fixed by it, we see that $H$ is a subgroup of
$R_{2m-2}$; but having the same dimension as $R_{2m-2}$ it must coincide with $R_{2m-2}$,
which contradicts the fact that it is locally isomorphic to $C_{m-1}$.

b) Suppose $H$ is locally isomorphic with $R_{2m-2}$. Again we consider $R_{2m}$ acting
in the natural way on $S^{2m}$, and $H$ acting on $S^{2m}$ as a subgroup.

We know that $H$ is not transitive on $S^{2m}$. We shall now show that no orbit
of $H$ can be $(2m-1)$-dimensional. If some orbit of $H$ is $(2m-1)$-dimensional,
then except for two orbits $H(x)$ and $H(y)$ which are of lower dimension all orbits
are $(2m-1)$-dimensional. If either $H(x)$ or $H(y)$ is a point then we see that $H$
is conjugate to a subgroup of $Q_{2m-1}$. Furthermore it would have to be canonically
imbedded in $Q_{2m-1}$ (Lemma 7) and $R_{2m}/H$ would be homeomorphic to
$R_{2m}/Q_{2m-2}$; but this space is not homeomorphic to $S^{4m-1}$. It is in fact homeomorphic
to the space of all line elements on $S^{2m}$.

Since neither $H(x)$ nor $H(y)$ is a point they must both be $(2m-2)$-dimensional.
Then $H_x$, or at any rate its identity component, is isomorphic to $R_{2m-3}$ (Lemma
1). Let $z$ be any point of $S^{2m}$ not in $H(x)$ nor in $H(y)$. If $z$ is sufficiently near
to $x$, $H_z$ is isomorphic to a subgroup of $H_x$ [11]. But $\dim H_z = \dim R_{2m-3} - 1$,
and $R_{2m-3}$ can contain no subgroup of this dimension. We have now completely
shown that $H$ can have no $(2m-1)$-dimensional orbit. Incidentally we have
pointed out that $H$ can have no zero-dimensional orbit.
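
The dimensions used in this step can be verified directly. The computation below is a sketch assuming $\dim R_k = k(k+1)/2$, so that $\dim H = \dim R_{2m-2} = (2m-1)(m-1)$.

```latex
% Assumption: \dim R_k = k(k+1)/2, hence \dim H = (2m-1)(m-1).
\[
\dim H_x = \dim H - \dim H(x) = (2m-1)(m-1) - (2m-2) = (m-1)(2m-3) = \dim R_{2m-3},
\]
\[
\dim H_z = \dim H - (2m-1) = (2m-1)(m-2) = (m-1)(2m-3) - 1 = \dim R_{2m-3} - 1 .
\]
```

So $H_z$ would be a subgroup of codimension one in a group isomorphic to $R_{2m-3}$, which Lemma 4 excludes once $2m - 3 \ge 2$.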

We may now say that every orbit of $H$ on $R_{2m}/R_{2m-1} = S^{2m}$ is $(2m-2)$-dimensional.
Before proceeding we pause to examine more carefully the structure
of $H$. The group $R_{2m-2}$ has no finite normal divisors. On the other hand the
fundamental group of $R_{2m-2}$ is cyclic of order two. Thus there is only one other group
locally isomorphic to $R_{2m-2}$, and this group is the simply connected covering
group $\tilde R_{2m-2}$ of $R_{2m-2}$. We may now say that $H$ is either $R_{2m-2}$ or $\tilde R_{2m-2}$ and we
next wish to eliminate the latter possibility. The only $(2m-2)$-dimensional
orbits which the group $\tilde R_{2m-2}$ can have (Lemma 7) are $S^{2m-2}$ and $P^{2m-2}$ (projective
space). The normal divisor of $\tilde R_{2m-2}$ contains the identity $e$ and one other
element which we shall call $a$. Since $\tilde R_{2m-2}$ is connected we see that if $\tilde R_{2m-2}$ is
transitive on $S^{2m-2}$ or $P^{2m-2}$ then $a$ must have a fixed point. But then $a$ leaves every
point of $S^{2m-2}$ or $P^{2m-2}$ fixed. Hence if $H$ were $\tilde R_{2m-2}$ the element $a$ would leave every
point of every orbit fixed and so would leave every point of $S^{2m}$ fixed, which is
contrary to our hypothesis. It follows that $H$ is the group $R_{2m-2}$ itself.

Hence (Lemma 7) for every $x$, $H_x$ is conjugate to $Q_{2m-3}$ (meaning the canonically
imbedded subgroup of $R_{2m-2}$) or to $Q'_{2m-3}$ depending on whether $H(x)$ is
homeomorphic to $S^{2m-2}$ or $P^{2m-2}$ respectively.

But then by Lemma 6, $2m - 2$ must be less than or equal to $2m/2 = m$, which fails for $m > 2$. With
this contradiction the proof of Theorem IV is concluded.


III


12. THEOREM V. Except for a finite number of $n$'s, $R_n$ has no subgroup $H$
such that $\dim H = \dim R_{n-1} - k$, $1 \le k \le n - 3$.

PROOF. If such an $H$ exists consider the action of $H$ on $R_n/R_{n-1} = S^n$.
Since $R_n$ is effective in its action on $S^n$, it follows that $H$ is also. We assume
that $H$ is connected, which merely means that, if it is not connected, we choose
the component of the identity.

In view of the results we have established we see that, with a finite number
of exceptional $n$'s, $H$ can not be transitive in its action on $R_n/R_{n-1} = S^n$. However,
on the basis of dimensional considerations, $H$ must have some $(n-1)$-dimensional
orbit. By Lemma 8 we see that all orbits are $(n-1)$-dimensional
except two which we shall denote by $H(x)$ and $H(y)$.

The group $H$ can not be effective on $H(x)$ and $H(y)$ because their dimensions
are too low. Hence there must be an invariant subgroup $H_1$ of positive dimension
of $H$ such that every point of $H(x)$ is stationary under $H_1$, and a similar
$H_2$ associated with $H(y)$. We assume that $H_1$ and $H_2$ are connected, which,
again, is merely a way of saying that if they are not connected we choose their
identity components.

Since $H_1$ is invariant we see that if $z$ is a stationary point of $H_1$ then every
point of $H(z)$ is also a stationary point of $H_1$. A similar remark applies to $H_2$.
Thus if $z$ is in $S^n - (H(x) + H(y))$, $z$ can not be a stationary point of $H_1$, for
if it were the stationary points of $H_1$ would separate $S^n$, which would imply that
every point of $S^n$ is a stationary point of $H_1$ [12] and that $H$ was not effective.
For the same reason $H_2$ can have no stationary points in $S^n - (H(x) + H(y))$.

Therefore $S(H_1)$, the set of stationary points of $H_1$, is either $H(x)$ or $H(x) + H(y)$.
In the latter case, the set $S(H_1)$ is not connected and from the fact
(Lemma 2) that $S(H_1)$ is a sphere we see in this case that both $H(x)$ and $H(y)$
are points. This implies that for some $g$, $gHg^{-1} \subset R_{n-1}$, which is impossible.

Consequently $S(H_1) = H(x)$ and by similar arguments $S(H_2) = H(y)$. Assume
that (where $0 < l_i < n - 1$)

$$\dim H(x) = l_1 \quad\text{and}\quad \dim H(y) = l_2 .$$

Lemma 2 tells us that $H(x)$ and $H(y)$ are respectively $l_1$-dimensional and
$l_2$-dimensional spheres.

At this point we shall need to recall (Lemma 8) that some cycle in $H(x)$ must
link some cycle in $H(y)$.

From this it follows that the linked cycles are actually the basic cycles of the
homology spheres and that

$$l_1 + l_2 = n - 1 .$$

We notice now that no element of $H_1$ can leave all of $H(y)$ fixed. The existence
of such an element would enable us to find a subgroup of $H_1$ which has
$H(y)$ as part of its stationary set. Such a group also has $H(x)$ as part of its
stationary set. The stationary set, being a sphere including these two linked
spheres, must coincide with $S^n$, which is impossible. That is, $H_1$ is effective on
$H(y)$ and similarly $H_2$ is effective on $H(x)$.




From the choice of $H_1$ we know that

$$\dim H/H_1 \le \frac{l_1(l_1+1)}{2},$$

$$\dim H_1 \ge \dim H - \frac{l_1(l_1+1)}{2} = \dim R_{n-1} - k - \frac{l_1(l_1+1)}{2}.$$

But

$$l_1 = n - l_2 - 1$$

and hence

$$\dim H_1 \ge \frac{(n-1)n - (n - l_2 - 1)(n - l_2) - 2k}{2} = \frac{l_2(2n - l_2 - 1) - 2k}{2}.$$

At this point we remark that we assume $n \ge 2$. We also assume, as we may, that
$l_2 < n/2$. By hypothesis we know that $k \le n - 3$. Now

$$l_2(2n - 2l_2 - 2) - 2k \ge l_2(2n - 2l_2 - 2) - 2n + 6.$$

This is clearly positive when $l_2 = 1$. Using the fact that $l_2 < n/2$, we have

$$l_2(2n - 2l_2 - 2) - 2n + 6 \ge l_2(2n - n - 2) - 2n + 6 = (l_2 - 2)(n - 2) + 2,$$

which is clearly positive when $l_2 \ge 2$. Hence we may always conclude that

$$\dim H_1 > \frac{l_2(l_2+1)}{2}.$$

But we proved above that $H_1$ is effective on $H(y)$, which is $l_2$-dimensional.
Therefore

$$\dim H_1 \le \frac{l_2(l_2+1)}{2}.$$

These two contradictory facts show that no subgroup $H$ of the kind described
in the theorem exists.


From this theorem we see that except for a finite number of $n$'s $R_n$ can have
no orbit $H(x)$ whose dimension satisfies the following inequality

$$n < \dim H(x) \le 2n - 3.$$
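
The range in this inequality can be traced back to Theorem V as follows. This is a sketch in which the orbit is taken to be a coset space $R_n/H$ with $\dim H = \dim R_{n-1} - k$, and $\dim R_n = n(n+1)/2$ is assumed.

```latex
% Assumptions: the orbit is R_n/H with \dim H = \dim R_{n-1} - k, and \dim R_n = n(n+1)/2.
\[
\dim R_n/H = \frac{n(n+1)}{2} - \frac{n(n-1)}{2} + k = n + k,
\qquad 1 \le k \le n-3 \;\Longleftrightarrow\; n < \dim R_n/H \le 2n - 3 .
\]
```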


We also point out the following theorem which is a corollary of our results. 

THEOREM VI. Let $H$ be a subgroup of $R_n$ which is isomorphic to $R_k$, $k > n/2 + 2$.
Then with the possible exception of a finite number of $n$'s $H$ is conjugate
to $Q_k$.

Since $H$ is a subgroup of $R_n$ we may consider the action of $H$ on $R_n/R_{n-1} = S^n$.
By the preceding theorem $H$ can have no orbit $H(x)$ such that

$$k < \dim H(x) \le n.$$

Furthermore because of a previous lemma not all orbits can have dimension $k$.
Hence $H$ leaves some point stationary, which implies that $H$ is conjugate to a
subgroup of $R_{n-1}$. By a finite number of applications of our process we obtain
the desired conclusion.


SmitH COLLEGE AND 
UNIVERSITY OF WYOMING 


BIBLIOGRAPHY 


1. CARTAN, La Théorie des Groupes Finis et Continus et l'Analysis Situs, Mémorial des
Sciences Mathématiques, no. 42.
2. CARTAN, La Topologie des Groupes de Lie, Actualités Scient. et Industr. 358; also in
L'Enseignement Math., vol. 35 (1936), pp. 177-200.
3. EISENHART, Continuous Groups of Transformations, Princeton, 1933.
4. FELDBAU, Sur la classification des espaces fibrés, C. R. Acad. Sci., Paris, vol. 208 (1939),
pp. 1621-1623.
5. GYSIN, Zur Homologietheorie der Abbildungen und Faserungen von Mannigfaltigkeiten,
Comment. Math. Helv., vol. 14 (1941-1942), pp. 61-122.
6. HOPF, Über die Topologie der Gruppen-Mannigfaltigkeiten und ihrer Verallgemeinerungen,
Annals of Math., vol. 42 (1941), pp. 22-52.
7. HOPF, Zum Clifford-Kleinschen Raumproblem, Math. Annalen, vol. 95 (1926), pp.
313-339.
8. HOPF, Über den Rang geschlossener Liescher Gruppen, Comment. Math. Helv., vol. 13
(1940), pp. 119-143.
9. MONTGOMERY AND ZIPPIN, Topological Transformation Groups I, Annals of Math.,
vol. 41 (1940), pp. 778-791.
10. MONTGOMERY AND ZIPPIN, A Theorem on Lie Groups, Bull. Am. Math. Soc., vol. 48
(1942), pp. 448-452.
11. MONTGOMERY AND ZIPPIN, A Class of Transformation Groups in $E_n$, American Jour. of
Math. To appear shortly.
12. NEWMAN, A Theorem on Periodic Transformations of Spaces, Quarterly Jour. of Math.,
vol. 2 (1931), pp. 1-8.
13. PONTRJAGIN, Topological Groups, Princeton, 1939.
14. SAMELSON, Beiträge zur Topologie der Gruppen-Mannigfaltigkeiten, Annals of Math.,
vol. 42 (1941), pp. 1091-1137.
15. SAMELSON, Über die Sphären, die als Gruppenräume auftreten, Comment. Math. Helv.,
vol. 13 (1940), pp. 144-155.
16. SEIFERT, Topologie dreidimensionaler gefaserter Räume, Acta Mathematica, vol. 60
(1933), pp. 147-238.
17. SMITH, Transformations of Finite Period, Annals of Math., vol. 39 (1938), pp. 127-164.
18. ZIPPIN, Transformation Groups, Lectures in Topology, Michigan (1941), pp. 191-221.
19. JACOBSON, Semi-Simple Lie Groups in the Large, Annals of Math., vol. 40 (1939), pp.
755-763.




ANNALS OF MATHEMATICS
Vol. 44, No. 3, July, 1943 


EXTENSIONS OF TOPOLOGICAL SPACES¹
By S. Fomin 
(Received February 8, 1943) 
Introduction 


An extension of a given topological space $R$ is by definition any topological
space containing $R$ as an everywhere dense subset. The first essential results
about extensions of topological spaces were obtained by A. Tychonoff [1], who
proved that every completely regular space can be immersed into a bicompact
one. An essential strengthening of this result was given by Čech [2]. He has
shown the existence, for every completely regular space, of a universal bicompact
extension $\beta R$, which is characterized by the following property: whatever
be the bicompact extension $bR$ of the space $R$, there exists a continuous mapping
of $\beta R$ on $bR$, leaving all points of $R$ fixed. Another construction of the same
extension was given by P. Alexandroff [3] in 1939. The method applied in
the paper of P. Alexandroff seems to us to be the most natural for the
questions of the theory of extensions, and, slightly modified, it enables us to
obtain from one and the same general considerations all facts known in this
domain (including the results of Stone on the H-closed extensions), as well as
some new results, which, in our opinion, are of some interest. A knowledge of
the paper of Čech is not presupposed in what follows.

In §§1-2 of the present paper we give a general construction of an arbitrary 
Hausdorff extension of a Hausdorff topological space, and introduce some 
notions, which are essential for the sequel. 

In §3 we consider H-closed extensions of a Hausdorff space. 

§4 contains a generalization of the notion of continuity which is applied in §5
to prove the existence, for every Hausdorff space, of a universal H-closed
extension analogous to the Čech extension.

In §6 the theory of bicompact extensions is given. 


1. A general method of construction of extensions of topological spaces 


In the course of the whole paper we shall confine ourselves to the considera- 
tion of Hausdorff topological spaces and their Hausdorff extensions. In §§6 and 
7 we shall assume even stronger separability axioms. In general, if the con- 
trary is not explicitly stated, by “space’’ we shall always understand a Haus- 
dorff topological space. 

Let $R$ be an arbitrary space, and $\mathfrak{G}$ one of its bases. Let $S$ be an aggregate
of centered² systems $\xi$, composed of the elements of this basis; we suppose that
$S$ satisfies the following conditions:


1 An exposition of the main results of this paper has been given without proofs in my Note
published under the same title in the Comptes Rendus de l'Acad. des Sciences de l'URSS
32 (1941), 114-116.

2 We call the system $\xi$ of non-vacuous sets $A$ centered, if the intersection of any two sets
from $\xi$ belongs to $\xi$. In accordance with this definition it is supposed that the considered basis
$\mathfrak{G}$ possesses the property that the intersection of any two elements from $\mathfrak{G}$ also belongs to $\mathfrak{G}$.




1. For every point $r \in R$ there exists a centered system $\xi_r \in S$, namely, the
system of all neighborhoods of the point $r$ belonging to the basis $\mathfrak{G}$.

2. If $\xi_1$ and $\xi_2$ are two different systems from $S$, then there exist open sets
$\Gamma_1$ and $\Gamma_2$ from $\mathfrak{G}$ such that $\Gamma_1 \in \xi_1$, $\Gamma_2 \in \xi_2$ and $\Gamma_1 \cap \Gamma_2 = 0$.

Every set $\Gamma$ entering into the system $\xi$ will be called a coordinate of this
system. Thus $\Gamma \in \xi$ means that "$\Gamma$ is a coordinate of the system $\xi$."

We now introduce in the set $S$ a topology, by defining the system of neighborhoods
of an arbitrary point $\xi \in S$ in the following way:³ if $\Gamma \in \xi$, then the
neighborhood $U_\Gamma(\xi)$ consists of all points $\xi'$ such that $\Gamma \in \xi'$.

It is easily seen that $S$ is actually a topological space. From the condition 2)
follows the existence of non-intersecting neighborhoods of any two points of $S$.
Thus $S$ is a Hausdorff space.

To every point $r \in R$ may be correlated a centered system $\xi_r$, which is the system
of neighborhoods of this point. It is evident that this is a one-to-one correspondence.
It is also continuous in both directions. In fact, let $r_0 \in R$ be a point
and $G_0$ be an arbitrary neighborhood of it, belonging to the basis $\mathfrak{G}$; then
$r \in G_0$ if and only if $\xi_r \in U_{G_0}$.

The image of the space $R$ is a set everywhere dense in $S$; for it follows from
condition 1) that every set $U_\Gamma$ contains a certain point $\xi_r$. Thus $S$ contains an
everywhere dense subset, homeomorphic to $R$, i.e. $S$ is an extension of the space $R$.

On the other hand, every extension $X$ of the space $R$ may be represented as
a space of an aggregate of centered systems of open sets, satisfying the conditions
1-2. To this end it is sufficient to correlate to every point $x \in X$ the centered
system $\{U(x) \cap R\}$, where $\{U(x)\}$ is the system of all neighborhoods of
the point $x \in X$, and to introduce in this set of centered systems a topology in
the way indicated above. That the conditions 1 and 2 are satisfied is obvious.
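
As a small illustration of this representation (a sketch only; the choice $R = (0,1)$, $X = [0,1]$ and the names $\xi_0$, $\xi_r$ are assumptions introduced for the example), the centered system attached to the added point $0$ and the separation required by condition 2 can be written out explicitly.

```latex
% Assumptions: R = (0,1), X = [0,1] with the usual topology, \mathfrak{G} = all open subsets of R.
% Centered system correlated to the added point 0 \in X:
\[
\xi_0 = \{\, U \cap R : U \text{ open in } X,\ 0 \in U \,\}
      = \{\, \Gamma \in \mathfrak{G} : (0,\varepsilon) \subset \Gamma \text{ for some } \varepsilon > 0 \,\}.
\]
% Condition 2 for \xi_0 and the system \xi_r of a point r \in R:
\[
\Gamma_1 = (0, r/2) \in \xi_0, \qquad \Gamma_2 = (r/2, 1) \in \xi_r, \qquad \Gamma_1 \cap \Gamma_2 = \emptyset .
\]
```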

Let us now formulate the obtained result: 

THEOREM 1. Every set of centered systems S, satisfying the conditions 1 and 2, 
is an extension of the original space R. Conversely, every extension of the space R 
can be represented as a space of a set of centered systems, satisfying the conditions 
1 and 2. 

Choosing in an appropriate way the set S of centered systems 8, we may, 
using the general method exposed above, construct different extensions of the 
original space possessing some a priori given properties. 


2. Some definitions. The spaces $\sigma_{\mathfrak{G}}R$, $\alpha_{\mathfrak{G}}R$ and $\alpha'_{\mathfrak{G}}R$


Consider some special types of centered systems, which shall play a fundamental
rôle in the sequel. Let a certain basis $\mathfrak{G}$ be chosen in the space $R$. We
introduce the following definitions.

A centered system $\{G\}$ of sets from $\mathfrak{G}$ is called a Hausdorff centered system
if, for every point $x \in R$ which does not enter into a certain set $G$ of the system
$\{G\}$, there exists another set $G'$, also belonging to the system $\{G\}$, whose closure
does not contain the point $x$.


3 This topology is introduced along the lines of P. Alexandroff [3].




A Hausdorff centered system is called a Hausdorff end, if it is maximal, i.e.
if it is not a true subsystem of any Hausdorff centered system composed of
elements of the basis $\mathfrak{G}$.

We shall require the following notions, introduced by P. Alexandroff [3].

The open set $\Gamma'$ is said to be completely regularly enclosed in the open set $\Gamma$,
if $\Gamma' \subset \Gamma$ and if to every rational number $t$ of the segment $[0, 1]$ may be correlated
an open set $\Gamma_t \in \mathfrak{G}$ in such a way that from $t_1 < t_2$ follows $\bar\Gamma_{t_1} \subset \Gamma_{t_2}$, and
$\Gamma' = \Gamma_0$, $\Gamma = \Gamma_1$.

$\Gamma'$ is regularly enclosed in $\Gamma$, if $\bar\Gamma' \subset \Gamma$.

A centered system $\{G\}$ of open sets from $\mathfrak{G}$ is called regular (completely regular),
if every set of the system $\{G\}$ has a set of the same system regularly
(completely regularly) enclosed in it.

A regular centered system is called a regular end, if it is not a true subsystem of any
regular centered system (composed of elements of the basis $\mathfrak{G}$).

A completely regular end is defined similarly.

We define now the topological spaces $\sigma_{\mathfrak{G}}R$, $\alpha_{\mathfrak{G}}R$ and $\alpha'_{\mathfrak{G}}R$, whose points are,
respectively, Hausdorff, regular and completely regular ends $\xi = \{G\}$, composed
of sets $G \in \mathfrak{G}$. The topology in these spaces is introduced in the way indicated
in §1.

The space $\sigma_{\mathfrak{G}}R$ shall be considered for an arbitrary Hausdorff space $R$, and
$\alpha_{\mathfrak{G}}R$ and $\alpha'_{\mathfrak{G}}R$ only for a regular $R$ and a completely regular $R$, respectively.
If $\mathfrak{G}$ is the system of all open sets of $R$, the spaces $\sigma_{\mathfrak{G}}R$, $\alpha_{\mathfrak{G}}R$, $\alpha'_{\mathfrak{G}}R$ are denoted
by $\sigma R$, $\alpha R$, $\alpha' R$. The spaces $\alpha R$ and $\alpha' R$ were first constructed by Alexandroff [3].

Lemma 1. Denote by $\{U(x)\}$ the aggregate of all elements of the basis $\mathfrak{G}$ of the
space $R$ containing a given point $x$ of this space. $\{U(x)\}$ is a Hausdorff end;
$\{U(x)\}$ is a regular (completely regular) end, if $R$ is regular (completely regular).

PROOF. We carry out the proof for a Hausdorff space $R$. The centered system
$\{U(x)\}$ is a Hausdorff system, since if $x' \notin U(x)$, then $x' \ne x$, and there exist
non-intersecting neighborhoods $V(x')$ and $U^0(x) \in \{U(x)\}$. Then $x' \notin \bar U^0(x)$.
It remains to prove that $\{U(x)\}$ is not contained in any larger Hausdorff centered
system. Let us assume the contrary. Let $\{U\}$ be a Hausdorff centered
system containing $\{U(x)\}$, and let $U_0$ be an element of it with $U_0 \notin \{U(x)\}$. Then $U_0$ does
not contain $x$, and consequently, there exists a $U_1 \in \{U\}$ such that $x \notin \bar U_1$.
But then there exists a $U(x)$ not intersecting $U_1$, which contradicts the
fact that $\{U\}$ is centered.

The other assertions of the lemma are proved similarly.

We have thus proved that the spaces $\sigma_{\mathfrak{G}}R$, $\alpha_{\mathfrak{G}}R$ and $\alpha'_{\mathfrak{G}}R$ satisfy (in the case
when the corresponding separability axioms hold for $R$) condition 1), §1. It is
easily seen that they also satisfy condition 2), since two Hausdorff (regular,
completely regular) ends coincide, if each coordinate of the one end has a non-vacuous
intersection with each coordinate of the other end.

Thus from the results of §1 follows

THEOREM 2. The spaces $\sigma_{\mathfrak{G}}R$, $\alpha_{\mathfrak{G}}R$ and $\alpha'_{\mathfrak{G}}R$ are Hausdorff extensions of the
space $R$.




3. H-closed extensions 


DEFINITION. A basis $\mathfrak{G}$ of the space $R$ is called algebraically closed, if from
$G \in \mathfrak{G}$ follows that also $R - \bar G \in \mathfrak{G}$.

The following theorem will be essential in the sequel.

THEOREM 3. If the basis $\mathfrak{G}$ is algebraically closed, then the space $\sigma_{\mathfrak{G}}R$ is
H-closed⁴.

The proof of this theorem is based on the following criterion of H-closedness
of a topological space.

THEOREM 4. Let $\mathfrak{G}$ be an arbitrary algebraically closed basis of the space $R$.
This space is H-closed if and only if every Hausdorff centered system, composed of
elements of the basis $\mathfrak{G}$, has a non-void intersection.


THE PROOF OF THEOREM 4


The condition is necessary. If there exists a Hausdorff centered system $\xi$ with
vacuous intersection, then $\xi$ may be adjoined to our space $R$ as a new (non-isolated)
point, taking for the neighborhoods of $\xi$ the sets $\xi \cup G$, where $G$ is an
arbitrary coordinate of the system $\xi$. The space $R \cup \xi$ is a Hausdorff space,
since for an arbitrary $x \in R$ there exists a coordinate $G$ of $\xi$ not containing $x$,
and so, in virtue of the fact that the system $\xi$ is a Hausdorff system, there exists
another coordinate $G'$ of $\xi$, whose closure does not contain $x$. Consequently
$x$ and $\xi$ possess non-intersecting neighborhoods. The space $R$ is not H-closed.

The condition is sufficient. Let the space $R$ be not H-closed. Let us adjoin
to $R$ a new (non-isolated) point $\xi$, and consider the aggregate $\{U(\xi)\}$ of all
neighborhoods of the point $\xi$ such that $R \cap U(\xi) \in \mathfrak{G}$.

It is easily seen that the sets $R \cap U(\xi)$ form a Hausdorff centered system.

Let $x$ be an arbitrary point of the space $R$. Since the space $R \cup \xi$ is a Hausdorff
space, there exists a neighborhood $V(x)$ (belonging to the basis $\mathfrak{G}$), whose
closure (in $R \cup \xi$) does not contain the point $\xi$ and therefore coincides with the
closure in $R$. Then, in virtue of the algebraical closedness of the basis $\mathfrak{G}$, the
set $R - \overline{V(x)}$ belongs to $\mathfrak{G}$. Now the set $(R \cup \xi) - \overline{V(x)}$ contains $\xi$, and consequently
the set

$$R - \overline{V(x)} = R \cap \bigl((R \cup \xi) - \overline{V(x)}\bigr)$$

is one of the coordinates of the considered centered system. The point $x$ is not
contained in this coordinate, and since $x$ was chosen arbitrarily, the intersection
of all coordinates of the considered system is void. Thus the sufficiency of the
condition is also proved.
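
The criterion can be seen at work in a simple case. The following is only a sketch; the space $R = (0,1)$, the basis of all open sets, and the sets $G_\varepsilon$ are assumptions chosen for illustration.

```latex
% Assumptions: R = (0,1) with its usual topology; \mathfrak{G} = all open subsets of R,
% which is algebraically closed because R - \bar G is open for every open G.
\[
\xi = \{\, G_\varepsilon = (0,\varepsilon) : 0 < \varepsilon \le 1 \,\}.
\]
% Centered: G_\varepsilon \cap G_\delta = G_{\min(\varepsilon,\delta)} \in \xi.
% Hausdorff: if x \notin G_\varepsilon, then G' = G_{x/2} \in \xi and \bar{G'} = (0, x/2] does not contain x.
\[
\bigcap_{0 < \varepsilon \le 1} G_\varepsilon = \emptyset ,
\]
% so by Theorem 4 the space (0,1) is not H-closed; indeed it is not closed in [0,1).
```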

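LEMMA 2. If $\mathfrak{G}$ is an algebraically closed basis, and $\sigma_{\mathfrak{G}}R$ is the space
corresponding to this basis, then

$$\sigma_{\mathfrak{G}}R - \overline{U}_G = U_{R - \bar{G}}.$$

PROOF. Let the Hausdorff end $\{\Gamma_\alpha\} \in \sigma_{\mathfrak{G}}R - \overline{U}_G$; then there exists a
neighborhood $U_\Gamma$ such that $U_\Gamma \cap U_G = 0$. But this means that $\Gamma \cap G = 0$ as well,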


4 The space R is called H-closed, if it is closed in every Hausdorff space, containing it. 




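and then we have also $\Gamma \cap \bar{G} = 0$, i.e. $\Gamma \subseteq R - \bar{G}$ and, consequently,
$\{\Gamma_\alpha\} \in U_{R - \bar{G}}$. Conversely, let $\{\Gamma_\alpha\} \in U_{R - \bar{G}}$. Then
$\Gamma_\alpha \cap (R - \bar{G}) \neq 0$ for every $\alpha$. If now $\{\Gamma_\alpha\} \in \overline{U}_G$, then
$G \cap (R - \bar{G}) \neq 0$, but this is impossible, since $(R - \bar{G}) \cap G = 0$.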

From this lemma immediately follows: 

COROLLARY. If $\mathfrak{G}$ is an algebraically closed basis of the space $R$, then
$\{U_G,\ G \in \mathfrak{G}\}$ is an algebraically closed basis of $\sigma_{\mathfrak{G}}R$.

PROOF OF THEOREM 3. Let $\{U_{G_\alpha}\}$ be an arbitrary Hausdorff centered system
composed of elements of the basis $\{U_G,\ G \in \mathfrak{G}\}$. Let us show that
$\prod_\alpha U_{G_\alpha} \neq 0$.

It is easily seen that the sets $G_\alpha$ form themselves a Hausdorff centered system
$\{G_\alpha\}$. Complete the system up to a maximal one. Then we obtain a certain
Hausdorff end, i.e. a certain point $\xi$ of the space $\sigma_{\mathfrak{G}}R$. It is obvious that every
open set $U_{G_\alpha}$ contains the point $\xi$, i.e. $\prod_\alpha U_{G_\alpha} \neq 0$. By Theorem 4, the space
$\sigma_{\mathfrak{G}}R$ is H-closed.


4. θ-continuous mappings


In studying non regular Hausdorff spaces the following definition is very 
useful. 

DEFINITION. Let $X$ and $Y$ be two topological spaces. A one-valued mapping
$f$ of $X$ into $Y$ is called θ-continuous at the point $x_0 \in X$ if for every neighborhood
$U(y_0)$ of the point $y_0 = f(x_0)$ there exists a neighborhood $U(x_0)$ such that
$f(\overline{U(x_0)}) \subseteq \overline{U(y_0)}$.

A (1-1)-mapping of $X$ onto $Y$ which is θ-continuous in both directions is called
a θ-homeomorphism between $X$ and $Y$.

A θ-continuous mapping of a space $X$ into a regular space $Y$ is continuous in
the ordinary sense. In all cases a continuous mapping is θ-continuous, the
converse being in general not true for irregular $Y$.
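
The difference between the two notions can be checked mechanically on finite spaces. The following is a minimal sketch, not taken from the paper: the function names and the two tiny example spaces are hypothetical, and the spaces are not Hausdorff, so the sketch only illustrates the definition itself, not the Hausdorff setting of this section.

```python
def closure(subset, points, opens):
    """Closure of `subset`: the points all of whose open neighborhoods meet it."""
    s = frozenset(subset)
    return frozenset(p for p in points if all(o & s for o in opens if p in o))


def is_continuous(f, X, opens_X, opens_Y):
    """Ordinary continuity: the preimage of every open set is open."""
    return all(frozenset(x for x in X if f[x] in o) in opens_X for o in opens_Y)


def is_theta_continuous(f, X, opens_X, Y, opens_Y):
    """For every x0 and every neighborhood V of f(x0) there must be a
    neighborhood U of x0 with f(closure(U)) contained in closure(V)."""
    for x0 in X:
        for V in (o for o in opens_Y if f[x0] in o):
            cl_V = closure(V, Y, opens_Y)
            if not any({f[x] for x in closure(U, X, opens_X)} <= cl_V
                       for U in opens_X if x0 in U):
                return False
    return True


# Hypothetical example: X is a two-point indiscrete space, Y is the
# Sierpinski space, in which the only proper non-empty open set is {'a'}.
X = frozenset({0, 1})
opens_X = {frozenset(), X}
Y = frozenset({'a', 'b'})
opens_Y = {frozenset(), frozenset({'a'}), Y}
f = {0: 'a', 1: 'b'}
```

In this example the preimage of `{'a'}` is `{0}`, which is not open in `X`, while the closure of every non-empty open set of `Y` is all of `Y`, so the closure condition holds trivially.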

Example. Let $R_1$ be the segment $[0, 1]$ with its ordinary topology. Let $R_2$
be the same segment with a topology defined by means of the following system
of neighborhoods. All points $\neq 0$ get their ordinary neighborhoods whereas
the neighborhoods of the point zero are the semisegments $[0, x)$ from which the
points $x_n = 1/n$ are picked out. $R_2$ is an irregular Hausdorff space and the
identical mapping of $R_1$ onto $R_2$ is a θ-homeomorphism (the identical mapping
of $R_2$ on $R_1$ is even continuous).

THEOREM 5. A continuous image of a H-closed space is H-closed. 

The proof is easily given using the well known criterion of H-closedness due to
Alexandroff: In order that the Hausdorff space $R$ be H-closed, it is necessary and
sufficient that from every covering of $R$ by open sets $\{G\}$ a finite system $G_1$,
$G_2, \ldots, G_n$ may be chosen in such a way that $\bar{G}_1 \cup \bar{G}_2 \cup \cdots \cup \bar{G}_n = R$.

THEOREM 6. Let the two Hausdorff spaces $X$ and $Y$ have in common a point
set $R$ which is dense in $X$ as well as in $Y$. Suppose that there exist θ-continuous
mappings $f$ and $g$ of $X$ on $Y$ and of $Y$ on $X$, respectively, which leave each point of
$R$ invariant. Then $f$ and $g$ are mutually inverse θ-homeomorphisms.




PROOF. Let $y = f(x)$ and $g(y) = z$. Assume $x \neq z$. Choose the neighborhoods
$U(x)$ and $U(z)$ without common points; then $U(x) \cap \overline{U(z)} = 0$. Take
$U(y)$ such that $g(\overline{U(y)}) \subseteq \overline{U(z)}$ and $r \in R$ such that $r \in U(x)$, $f(r) \in \overline{U(y)}$. Then

$$r = g(f(r)) \in U(x) \cap g(\overline{U(y)}) \subseteq U(x) \cap \overline{U(z)},$$

in contradiction with the choice of $U(x)$, $U(z)$. This contradiction proves the
identity $x = z$ and thus Theorem 6.

The proof of the following proposition presents no difficulty and may be left 
to the reader: 

THEOREM 7. If the space $Y$ does not contain inseparable points (i.e. for any
$y_1 \in Y$, $y_2 \in Y$ there are neighborhoods $U(y_1)$, $U(y_2)$ with
$\overline{U(y_1)} \cap \overline{U(y_2)} = 0$) and $Y$ is a θ-continuous image of $X$, then $X$ also
does not contain inseparable points.

From Theorems 5 and 7 it follows that the H-closed spaces without inseparable
points form a class invariant under θ-homeomorphisms. The above example
shows that bicompactness is not invariant under θ-homeomorphisms.


5. The universal H-closed extension 


We shall prove the following fundamental theorem: 

THEOREM 8. The space $\sigma R$ has the following properties:

1°. The space $\sigma R$ may be θ-continuously mapped on every H-closed extension of
$R$ in such a way that all points of $R$ remain invariant under this mapping;

2°. Every H-closed extension of $R$ possessing the property 1° is θ-homeomorphic
with $\sigma R$.

Proor. If 1° is proved, then 2° follows immediately by Theorem 6. Let us 
prove the assertion 1°. The construction of the mapping f(x) below is the same 
as the corresponding construction in the paper of Alexandroff (proof of the 
Theorem II, pp. 408-409). 

Let $S$ be an arbitrary H-closed extension of $R$ and

$$x = \{\Gamma_\alpha\} \in \sigma R.$$

Take the intersection

$$P_x = \prod_\alpha \bar{\Gamma}_\alpha^S,$$

where $\bar{\Gamma}_\alpha^S$ means the closure of $\Gamma_\alpha$ in $S$.

We first prove that $P_x$ is not vacuous. Denote by $G_\alpha = J(\bar{\Gamma}_\alpha^S)$ the set of all
inner points of $\bar{\Gamma}_\alpha^S$. The system $\{G_\alpha\}$ is centered. Now two cases are possible.

a) The system $\{G_\alpha\}$ is not a Hausdorff system. This means that there exists
a point $y_1 \in S$ such that for a certain $G_{\alpha_1}$ the point $y_1$ does not belong to $G_{\alpha_1}$,
while $y_1 \in \bar{\Gamma}_\alpha^S$ for all $\alpha$, i.e. $y_1 \in P_x$.

b) The system $\{G_\alpha\}$ is a Hausdorff system. Then by Theorem 4 the $G_\alpha$ have
a non-vacuous intersection obviously contained in $P_x$.

Now we prove that $P_x$ consists of one point only. To this end we prove the




LEMMA 3. If $y \in P_x$ then for each neighborhood $U(y)$

$$R \cap U(y) \in x.$$

PROOF. By the definition of the point $y$ for every $\Gamma \in x$ we have $y \in \bar{\Gamma}^S$,
wherefrom follows that for every $U(y)$ it is $(R \cap U(y)) \cap \Gamma \neq 0$. The centered system
$\{U(y)\}$ being a Hausdorff one, the same is true for the system $\{R \cap U(y)\}$.
Consequently $\{\Gamma_\alpha\} \cup \{R \cap U(y)\}$ is a centered Hausdorff system. In virtue
of the maximality of $\{\Gamma_\alpha\}$ it is $\{R \cap U(y)\} \subseteq \{\Gamma_\alpha\}$. The lemma is proved.

If now $y_1 \in P_x$, $y_2 \in P_x$ then choosing non intersecting neighborhoods $U(y_1)$
and $U(y_2)$ we see that $\{\Gamma_\alpha\}$ contains two non intersecting sets $R \cap U(y_1)$ and
$R \cap U(y_2)$, which is impossible.

Putting for every $x = \{\Gamma_\alpha\} \in \sigma R$

$$y = f(x) = \prod_\alpha \bar{\Gamma}_\alpha^S$$

we obtain a mapping of $\sigma R$ into $S$. To prove that $f$ maps $\sigma R$ on the whole
$S$ take any $y \in S$. The system $\{U(y)\}$ is a centered Hausdorff system. Such
is consequently the system $\{R \cap U(y)\}$ as well. Let us complete it to a Hausdorff
end $\{\Gamma_\alpha\} = x$. Then

$$f(x) = \prod_\alpha \bar{\Gamma}_\alpha^S \subseteq \prod \overline{U(y)}^S = y,$$

i.e. $f(x) = y$, which proves the assertion.

Let us finally prove that $f$ is θ-continuous. Take $x \in \sigma R$,

$$f(x) = y \in S$$

and an arbitrary neighborhood $U(y)$. We are looking for a neighborhood $U(x)$
such that $f(\overline{U(x)}) \subseteq \overline{U(y)}^S$.

Let $G = U(y) \cap R$. As $R$ is dense in $S$ we have $\bar{G}^S = \overline{U(y)}^S$. By
Lemma 3 it is $G \in x$. Define

$$U(x) = U_G.$$

To prove the relation $f(\overline{U(x)}) \subseteq \overline{U(y)}^S$ take any $x' \in \overline{U(x)}$. Then

(1) $\Gamma' \cap G \neq 0$ for an arbitrary $\Gamma' \in x'$.

We have to prove that $y' = f(x') \in \overline{U(y)}^S$. Suppose it is not the case. Then
there exists a neighborhood $U_1(y')$ such that $U(y) \cap U_1(y') = 0$. Take $\Gamma_0 =
U_1(y') \cap R$. By Lemma 3 it is $\Gamma_0 \in x'$. But $\Gamma_0 \cap G = 0$ in contradiction
with (1).

Theorem 8 is thus completely proved. 


6. Bicompact extensions 


The present paragraph deals with bicompact extensions of the given space R. 
Therefore R must be supposed completely regular, since otherwise the space R 
does not possess Hausdorff bicompact extensions at all. 




THEOREM 9. Each of the spaces $\sigma R$, $\alpha R$, $\alpha' R$ may be continuously mapped on
every bicompact extension $B$ of $R$ in such a way that all points of $R$ remain invariant
under this mapping.

The proof of this theorem is quite similar to the proof of the first part of
Theorem 8: it follows directly from the bicompactness of $B$ that the intersection

$$P_x = \prod_\alpha \bar{\Gamma}_\alpha^B$$

is non-vacuous for every $x = \{\Gamma_\alpha\}$. The mapping $f$ on $B$ will be continuous
(and not only θ-continuous) by virtue of the regularity of $B$.

THEOREM 10. If the space $R$ is completely regular then $\alpha_{\mathfrak{G}}R$ is also completely
regular.

PROOF. For the sake of simplicity we shall suppose that the basis $\mathfrak{G}$ is the
system of all open sets of the space $R$, i.e. we consider the space $\alpha' R$.

We call the open set $\Gamma \subseteq R$ canonic if $\Gamma = R \setminus \overline{(R \setminus \bar{\Gamma})}$, i.e. if all inner points
of $\bar{\Gamma}$ belong to $\Gamma$. Alexandroff has proved ([3], p. 410) the following

LEMMA. The sets $U_\Gamma$, where $\Gamma$ is canonic, form a basis of $\alpha' R$.

Let now $x = \{\Gamma_\alpha\}$ be an arbitrary point of the space $\alpha' R$. Let us show that
if the set $\Gamma_2$ is completely regularly enclosed in a canonic open set $\Gamma_1$, then
$\overline{U}_{\Gamma_2} \subseteq U_{\Gamma_1}$, i.e. $\overline{U}_{\Gamma_2} \cap (\alpha' R \setminus U_{\Gamma_1}) = 0$. Assume that this is not so. Then
there is a point $\xi = \{G\}$ such that $\xi \in \overline{U}_{\Gamma_2} \cap (\alpha' R \setminus U_{\Gamma_1})$. The inclusion
$\xi \in \overline{U}_{\Gamma_2}$ means that for each $G \in \xi$ the intersection $G \cap \Gamma_2$ is non-vacuous;
$\xi \in U_{\Gamma_1}$ would mean that $\Gamma_1 \in \xi$, i.e. that there exists a $G \in \xi$ contained in $\Gamma_1$.
Thus $\xi \notin U_{\Gamma_1}$ means that no $G \in \xi$ is contained in $\Gamma_1$, i.e. that none of the
intersections $G \cap (R \setminus \Gamma_1)$ vanishes, and then (since $\Gamma_1$ is canonic) all
$G \cap (R \setminus \bar{\Gamma}_1)$ are different from zero. Thus, every $G \in \xi$ has a non vanishing
intersection with $\Gamma_2$ as well as with $R \setminus \bar{\Gamma}_1$. In other words, there exists a
completely regular end, such that each of its coordinates has a non-vacuous
intersection with both of the open sets $\Gamma_2$ and $R \setminus \bar{\Gamma}_1$, the closures of $\Gamma_2$ and
$R \setminus \bar{\Gamma}_1$ being completely separated. Let us lead this result ad absurdum.

LEMMA 4. If $G_1$ and $G_2$ are open sets with completely separated closures and
$G = (G_1 \cup G_2) \in \xi$, then either $G_1 \in \xi$, or $G_2 \in \xi$.

PROOF. If the coordinate $G'$ of the point $\xi$ is contained in $G$, then $G'$ falls
into two sets $G'_1$ and $G'_2$, one of which is contained in $G_1$, and the other in $G_2$.
Thus a completely regular centered system of sets $\xi = \{G'\}$ generates two
systems $\{G'_1\}$ and $\{G'_2\}$. At least one of the systems $\xi \cup \{G'_1\}$ and $\xi \cup \{G'_2\}$ must
be centered and completely regular, and must, consequently, coincide with $\xi$.
Lemma 4 is proved.

LEMMA 5. Let $f(x)$ be a function, equal to zero on $\Gamma_2$ and to one on $R - \Gamma_1$;
let $0 < a < b < 1$, and $\Gamma(a, b)$ be the open set, whose points $x$ satisfy the condition
$a < f(x) < b$; then every $G \in \xi$ has with $\Gamma(a, b)$ a non vacuous intersection.⁵ In
fact, otherwise $G$ would fall into two sets $G_1$ and $G_2$ with completely separated
closures, such that $G_1 \cap \Gamma_2 = 0$ and $G_2 \cap (R - \Gamma_1) = 0$, and, in virtue of Lemma 4,


⁵ Hence, in particular, follows that $\Gamma(a, b)$ itself is non-vacuous.




either $G_1$, or $G_2$ would enter into $\xi$, and this contradicts the assumption that
$\xi \in \overline{U}_{\Gamma_2} \cap (\alpha' R - U_{\Gamma_1})$. Lemma 5 is proved.

Let us choose $a_0$ and $b_0$ such that $0 < a_0 < b_0 < 1$. Consider the aggregate of
all $\Gamma(a, b)$, for which $0 < a < a_0 < b_0 < b < 1$. They form a completely regular
centered system. The system $\xi \cup \{\Gamma(a, b)\}$ is, according to the above, centered,
and since $\xi$ and $\{\Gamma(a, b)\}$ are completely regular systems, $\xi \cup \{\Gamma(a, b)\}$ is also
completely regular. By the maximality of $\xi$, each $\Gamma(a, b)$ is then a coordinate
of $\xi$. But $\Gamma(a, b) \subseteq \Gamma_1$ and, consequently, $\Gamma(a, b) \cap (R - \Gamma_1) = 0$
and $\xi \notin (\alpha' R - U_{\Gamma_1})$, i.e. $\overline{U}_{\Gamma_2} \subseteq U_{\Gamma_1}$. Thus we have proved that $U_{\Gamma_2}$ is
regularly enclosed in $U_{\Gamma_1}$. It is easily seen that this inclusion is also completely
regular. In fact, because of the complete regularity of $R$, to every rational
number $t$, $0 \leq t \leq 1$, may be put in correspondence an open set $\Gamma_t$ in such a way
that $\Gamma_{t'}$ is completely regularly enclosed in $\Gamma_{t''}$, if $t' < t''$; $\Gamma_t = \Gamma_1$ for $t = 1$ and
$\Gamma_t = \Gamma_2$ for $t = 0$. By what has been proved, $\overline{U}_{\Gamma_{t'}} \subseteq U_{\Gamma_{t''}}$, if $t' < t''$.
Consequently $U_{\Gamma_2}$ is completely regularly enclosed in $U_{\Gamma_1}$. We have proved that
the space $\alpha' R$ is completely regular.

THEOREM 11. If $\mathfrak{G}$ is an algebraically closed basis, then the space $\alpha_{\mathfrak{G}}R$ is
bicompact.

The proof of this theorem is analogous to the proof of theorem 3, and will be 
therefore only sketched here. We use in this proof the following criterion for 
bicompactness of a completely regular topological space: 

THEOREM 4'. A completely regular space $R$ is bicompact if and only if every
completely regular centered system of elements of a given algebraically closed basis of $R$
has a non-vanishing intersection.

The necessity of this condition follows from Theorem 4. 

To prove that this condition is sufficient, let us remember that by a known
theorem of Tychonoff [1] a non-bicompact space $R$ can be immersed in a
completely regular space $R \cup \xi$ where $\xi$ is a non-isolated point. Let $\{U(\xi)\}$ be the
system of all neighborhoods of $\xi$, whose intersections with $R$ belong to $\mathfrak{G}$:

$$(U(\xi) \cap R) \in \mathfrak{G}.$$

Then the system $\{U(\xi) \cap R\}$ is a completely regular centered system with a void
intersection. Thus the announced criterion is proved.

We need furthermore the following 

LEMMA 2'. If $\mathfrak{G}$ is an algebraically closed basis of $R$, then
$\alpha_{\mathfrak{G}}R - \overline{U}_G = U_{R - \bar{G}}$.

This lemma is proved in the same way as Lemma 2. It follows from
Lemma 2':

COROLLARY. The open sets $U_G$ corresponding to an algebraically closed basis $\mathfrak{G}$
of the completely regular space $R$ form an algebraically closed basis of $\alpha_{\mathfrak{G}}R$.

The rest of the proof of Theorem 11 is the same as that of Theorem 3. 

From Theorems 9 and 11 follows immediately: 

COROLLARY. The space $\alpha' R$ is a maximal bicompact extension of $R$.

Such an extension being unique,⁶ the space $\alpha' R$ coincides with the Čech
extension $\beta R$.


⁶ See [3], Lemma IX.


Remark. Thus we have given a direct proof of the fundamental identity
$\alpha' R = \beta R$ for any completely regular $R$; the proof of this identity given in [3]
makes essential use of other properties of the space $\beta R$.

Let us now see under what conditions the space $R$ has a minimal bicompact
extension, i.e. an extension which is a continuous image of every other bicompact
extension of the space $R$. It is obvious that such an extension exists for
every locally bicompact $R$, and is obtained by adjunction to $R$ of one point.

Let us prove that, conversely: 


If the bicompact extension $B$ of the space $R$ is minimal, then it is obtained by
adjunction to $R$ of one point and, consequently, exists only for a locally bicompact $R$.

In fact, let $B$ be a bicompact extension of the space $R$, such that $B - R$ contains
more than one point. Let $x \in B - R$ and $y \in B - R$. Consider the space
$B'$, which is obtained from $B$ by identifying the points $x$ and $y$. This space is
Hausdorff, bicompact and is an extension of $R$. The space $B$ may be in an obvious
manner continuously mapped on $B'$ so that all points of $R$ remain fixed. $B$
cannot be a continuous image of $B'$, since otherwise the mapping of $B$ on $B'$,
indicated above, would be a homeomorphism (Theorem 6); but at the same time
this mapping is not one-to-one.

Thus, our assertion is completely proved.


MATHEMATICAL INSTITUTE OF THE
ACADEMY OF SCIENCES OF THE U.S.S.R.


CITATIONS

[1] A. TYCHONOFF, Über die topologische Erweiterung von Räumen, Math. Annalen 102 (1929), 544.
[2] E. ČECH, On bicompact spaces, Annals of Math. 38 (1937), 823-845.
[3] P. ALEXANDROFF, Bikompakte Erweiterungen topologischer Räume, Rec. Math. 5 (1939), 403-423.


ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


FOUNDATIONS OF THE THEORY OF LIE GROUPS WITH 
REAL PARAMETERS 


By P. A. SMITH


In memory of E. R. VAN KAMPEN
(Received November 7, 1942) 


The material presented below amounts roughly to the so-called fundamental 
theorems of Lie and their implications concerning Lie algebras, Lie subgroups 
and subalgebras. By straightforward and fairly elementary steps, we shall 
extend the concept of Lie group to include groups admitting coordinate systems 
in which the functions defining ab, the group product of elements a, b, possess 
continuous first order derivatives in the coordinates of a and satisfy a Lipschitz 
condition in those of b. That such groups are equivalent to classical (i.e. 
analytical) Lie groups was announced in 1936 by van Kampen, although ap- 
parently van Kampen published nothing by way of proof beyond a certain 
decisive uniqueness theorem concerning systems of ordinary differential equa- 
tions. (This is contained in our theorem (12.1); the proof given below is essen- 
tially that of van Kampen [3]). It is conceivable that the condition relative 
to coordinates of b could be weakened or dispensed with entirely. For, the 
only part the condition in question plays in our development is to make it 
certain that the inverse of a certain mapping—the mapping into “canonical 
coordinates”—is single-valued. 

In this connection mention should be made of the paper [1] of Garrett Birkhoff 
in which it is shown that the existence of continuous first order derivatives of 
ab with respect to the coordinates of both the a’s and the b’s is sufficient for 
defining a Lie group. 

We shall consider only groups with a finite number of real parameters (whereas 
the groups considered by Birkhoff are not necessarily finite dimensional). This 
makes it possible to use the standard existence theorems for systems of ordinary 
real differential equations. We make no use whatever of the theory of partial 
differential equations. 

We maintain throughout a purely local point of view in the sense that we 
consider only what happens in the neighborhood of the identity. 

Historical notes. The proof of the uniqueness of square roots (8.5) is due to 
Claude Chevalley (unpublished). 

The proof given below of Lie’s theorem that a linear system of vector fields 
which is closed under commutation defines a Lie group is, we believe, due to 
van der Waerden [4]. We have supplied a number of preliminary lemmas which 
make the proof rigorously applicable to the case in which the vector fields are 
only assumed to possess continuous first order derivatives. 

The theorem in §17 that every local subgroup of a Lie group is a Lie group 
is due essentially to E. Cartan [2]. 



That the functions ¢; defined in (14.16) are solutions of the so-called equations 
of Maurer (16.14) was first pointed out by Whitehead [5]. 

The existence of the functions $a^t$ ($a$ an element of a given group, $t$ a real
variable) was first established intrinsically by Garrett Birkhoff [1] although by
methods quite different from those given below (8.6). A simple topological
theorem which we require in this connection is proved in the appendix.


1. Local groups 


The “groups” of classical Lie theory are not abstract topological groups in the 
contemporary sense since products are generally defined only in a neighborhood 
of the identity. Such partially defined groups have in recent years been called 
group nuclei, group germs, partial groups, local groups. Following Pontrjagin 
(Topological Groups) we adopt the name local group. It is to be understood 
that a space means a Hausdorff space. 

DEFINITION. A local group is a space $G = \{a, b, \ldots\}$ together with a function
of composition, denoted by $ab$, satisfying the following conditions: (1) wherever
defined, $ab$ is single-valued and continuous in the pair $(a, b)$; its values are elements
of $G$; (2) if $a(bc)$, $(ab)c$ are defined, they are equal; (3) there exists a unique element
1, the identity, such that $1a$ and $a1$, when defined, are equal to $a$; (4) denoting by
$a^{-1}$ any element $a'$ such that $aa'$ and $a'a$ are defined and equal to 1, the function $a^{-1}$
is single-valued and continuous wherever defined; (5) there exists a neighborhood $U$ of
1 such that $ab$ and $a^{-1}$ are defined for every $a$, $b$ in $U$. We shall call $U$ a nucleus of
$G$. Evidently every open subset of $U$ which contains 1 is also a nucleus. $G$ is a
topological group if $G$ itself is a nucleus.
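
As a concrete illustration of a partially defined product, the following minimal sketch (a hypothetical example, not one given in the paper) takes for $G$ the open interval $(-1, 1)$ with addition as composition, defined only when the sum stays inside the interval. All names (`product`, `inverse`, `in_nucleus`, `triple_products`) are assumptions made for the illustration; exact rational arithmetic is used so that equalities can be tested exactly.

```python
from fractions import Fraction


def in_G(a):
    """Membership in the hypothetical local group G = (-1, 1)."""
    return -1 < a < 1


def product(a, b):
    """The composition ab: the sum a + b when it lies in G, otherwise undefined (None)."""
    c = a + b
    return c if in_G(c) else None


def inverse(a):
    """The inverse of a; here it is defined on all of G. The identity is 0."""
    return -a


def in_nucleus(a):
    """U = (-1/2, 1/2): products and inverses of elements of U are always defined."""
    return abs(a) < Fraction(1, 2)


def triple_products(a, b, c):
    """Return ((ab)c, a(bc)), using None for whichever side is undefined."""
    ab, bc = product(a, b), product(b, c)
    left = product(ab, c) if ab is not None else None
    right = product(a, bc) if bc is not None else None
    return left, right
```

With $a = b = 3/4$ and $c = -3/4$ the product $(ab)c$ is undefined while $a(bc) = 3/4$ is defined, which is why condition (2) asks for equality only when both sides exist.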

Let $A$, $B$ be subsets of a local group $G$. Let $AB = \{ab : a \in A,\ b \in B\}$
assuming all these products $ab$ are defined. Define $A^{-1}$ similarly. It is an easy
consequence of the relation $1 \cdot 1 = 1$, the continuity of $ab$, $a^{-1}$ and the properties of
Hausdorff spaces that, given a nucleus $V$, there exists a nucleus $U$ such that
$UU \subset V$, $U^{-1} \subset V$ (then of course $\bar{U}\bar{U} \subset \bar{V}$, $\bar{U}^{-1} \subset \bar{V}$). Every nucleus $V$
contains a symmetric nucleus $W$ ($W^{-1} = W$). In fact, take $W = V \cap V^{-1}$.

Let $U$, $V$ be nuclei with $UU \subset V$. Using the associative law, we can define,
as in the case of ordinary groups, products of all sets of three elements in $U$;
evidently $U$ can be so chosen that $UUU \subset V$. In fact, for given $n$, $U$ can be
so chosen that the $n$-fold product $U \cdots U$ is defined and in $V$.

If A, B are subsets of a local group and if A is open, it is easy to show that 
AB (if defined) is open. 

Let $V$, $V_1$ be nuclei of local groups $G$, $G_1$. Let $U$ be a nucleus of $G$ such that
$UU \subset V$, $U^{-1} \subset V$ and let $U_1$ be a similar nucleus in $G_1$. A continuous mapping
$\tau U = U_1$ such that $\tau(ab) = \tau(a)\tau(b)$ is a local homomorphism $G \to G_1$. In
particular $\tau$ is a local isomorphism $G \to G_1$ if the mapping $\tau U = U_1$ is a
homeomorphism. It is obvious that a local homomorphism carries the identity into
the identity and (in the neighborhood of the identity) inverses into inverses.
The relation of local isomorphism is reflexive, symmetric, transitive.




Let G and H be local groups and assume that H is a closed subset of G and 
derives its topology from that of G. If the identities of H and G coincide and 
if the product of two elements of H, whenever defined, is the same as it is if the 
elements are regarded as elements of G, (provided this product is also defined) 
and if the same is true concerning inverses, we call H a local subgroup of G. 

THEOREM (1.1). Let $G$, $G_0$ be local groups, $f$ a mapping of a nucleus of $G$ into
a subset of $G_0$. Assume that there exists a nucleus $V$ of $G$ such that for $a$, $b$
in $V$, $f(ab)$ and $f(a)f(b)$ are defined and equal, $f(a^{-1})$, $(f(a))^{-1}$ are defined and
equal. Then there exists a nucleus $U$ such that $f$ is defined over $U$ and such
that under $f$, the images of sufficiently small nuclei of $G$ are open in the set
$\Gamma = f(U)$.

PROOF. Let $X$ be a nucleus with $XX \subset V$, $U$ a nucleus with $UU \subset X$.
We shall show that if $W$ is a nucleus such that $W \subset U$ and $W^{-1} \subset U$, then $f(W)$
is open in the set $\Gamma = f(U)$.

Let $I$ consist of those elements $i$ of $X$ such that $f(i) = 1_0$. Evidently $I$ is
closed. Since $WI \subset UX \subset XX \subset V$, $f(WI)$ is defined. If $w \in W$, $i \in I$, then
$f(wi) = f(w)f(i) = f(w)$ so that $f(WI) = f(W)$.

Consider an element $a$ in $U$ such that $f(a) \in f(WI)$. Then $f(a) = f(wj)$ say
where $w \in W$, $j \in I$. Hence $f(a) = f(w)$, and hence $f(w^{-1}a) = (f(w))^{-1}f(a) = 1_0$.
Now $w^{-1}a \in W^{-1}U \subset UU \subset X$. Hence $w^{-1}a \in I$, say $w^{-1}a = i$. Then $a =
wi \in WI$. In short, for elements $a$ in $U$, "$f(a) \in f(WI)$" implies "$a \in WI$", and
accordingly, those elements not in $WI$ are carried by $f$ into elements not in
$f(WI)$. Hence

$$f(U - U \cap WI) = \Gamma - f(W).$$

Now $W$ being open, $WI$ is open. Hence $U - U \cap WI$ is closed. Hence $f(W)$
is open in $\Gamma$.

(1.2) Application. Define the "product" of two elements of $\Gamma$ in theorem
(1.1) to be their product as elements of $G_0$ whenever this product exists and is an
element of $\Gamma$. Similarly for inverses. Then we assert that $\Gamma$ becomes a local
subgroup of $G_0$ and $f$ defines a local homomorphism $G \to \Gamma$. We have in fact
only to show that $\Gamma$ has a nucleus and that this nucleus is an image under $f$ of a
nucleus in $G$. Let $W$ be a nucleus of $G$ such that $WW \subset U$, $W^{-1} \subset U$. Let
$x_0 = f(x)$, $y_0 = f(y)$ ($x, y \in W$) be arbitrary elements of the open set $f(W)$. Then
$x_0 y_0 = f(xy) \in f(U) \subset \Gamma$ so that $\Gamma$-products are defined over $f(W)$. So are
$\Gamma$-inverses.

For more details concerning local groups and subgroups see Pontrjagin, 
Topological Groups. 


2. Coordinate systems 


A local group $G$ is locally euclidean $r$-dimensional if some nucleus is
homeomorphic to euclidean $r$-space $E_r$.

A locally euclidean local group $G$ admits coordinate systems in the neighborhood
of 1. The most convenient situation is of course that in which there are
coordinate systems covering the whole of $G$. For this reason we shall consider
exclusively what we shall call $r$-parameter local groups, namely local groups
which, as spaces, are homeomorphic to $E_r$. We denote such local groups
generically by $G_r$. Aside from questions of convenience, justification for
introducing the $G_r$'s lies in the obvious fact that every locally euclidean $r$-dimensional
local group is locally isomorphic to a $G_r$.

A coordinate system in $G_r$ will always mean a coordinate system covering the
whole of $G_r$, that is, a coordinate system defined by a homeomorphism between
$E_r$ and the whole of $G_r$. It will always be understood that the origin of a
coordinate system is at 1. In a given coordinate system the $i$-th coordinate of an
element will always be denoted by a superscript, as $a^i$, $(ab)^i$, $(a^{-1})^i$. The element
1 in a $G_r$ with a coordinate system will always be denoted by $O$, the symbol for
the vector $(0, \ldots, 0)$.

A coordinate system for $G_r$ having been chosen, it will be convenient to regard
the elements of $G_r$ as vectors which can be added to each other and multiplied
by scalars. We introduce also a modulus: $|a| = (\sum (a^i)^2)^{1/2}$. These vector
functions depend of course on the given coordinate system and have no
invariantive significance.

A function of one or more elements in $G_r$ is generally understood to have
values which are elements in $G_r$. In a given coordinate system, equations
between functions and their derivatives stand for a set of equations corresponding
to the different coordinates. Thus, $f(a, b) = c$ means $f^i(a, b) = c^i$; $\partial(ab)/\partial a^j =
c$ means $\partial(ab)^i/\partial a^j = c^i$ ($i = 1, \ldots, r$).

We shall presently be dealing with single-valued functions $f(a, b, \ldots, c)$ where
some of the variables are real variables, some are elements in a $G_r$ with definitely
chosen coordinate system, and where the values of $f$ are real or in $G_r$. To avoid
unessential details we shall not always specify the domain of existence of such a
function. But in any case this domain will always contain a domain
$D_\delta$: $|a| < \delta, \ldots, |c| < \delta$. We shall say that $f$ is locally continuous if it is continuous in
the totality of its variables over some $D_\delta$. The same definition is to apply to
functions $f(a, \ldots, c; e)$ when $e$ is an arbitrary unit vector in $G_r$, except that the
relation $|e| = 1$ is to be adjoined to the definitions of $D_\delta$.

A coordinate system $\Sigma$ for $G_r$ is left-differentiable if the functions

$$(ab)^i = f^i(a^1, \ldots, a^r; b^1, \ldots, b^r)$$

possess locally continuous derivatives with respect to $a^1, \ldots, a^r$. $\Sigma$ is of
class $k$ ($k \geq 1$) if the $f^i$ possess locally continuous derivatives of orders less than
or equal to $k$ with respect to the $a$'s and $b$'s. $\Sigma$ is analytic if there is a $\delta > 0$ such
that the $f^i$ can be represented as power series which converge for $|a| < \delta$,
$|b| < \delta$.


3. Left differentiable coordinates 


Consider a $G_r$ with a left-differentiable coordinate system. Let $e$ be an arbitrary
unit vector, $s$ a real variable, and let $d/ds_0$ denote the $s$-derivative at $s = 0$.




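Let

$$\psi_j(b) = \left(\frac{\partial (ab)}{\partial a^j}\right)_{a=O}, \tag{3.1}$$

$$\psi(s, b; e) = \frac{d}{ds}\big((se)b\big), \tag{3.2}$$

$$\psi(b; e) = \frac{d}{ds_0}\big((se)b\big). \tag{3.3}$$

Left differentiability implies that these three functions are locally continuous.
Evidently

$$\psi(0, b; e) = \psi(b; e) = \lim_{s \to 0} \frac{1}{s}\big((se)b - b\big). \tag{3.4}$$

Putting $b = O$ gives

$$\psi(O; e) = e. \tag{3.5}$$

We note also that

$$\psi_j(O) = \left(\frac{\partial (ab)}{\partial a^j}\right)_{a=O,\ b=O}. \tag{3.6}$$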

(3.7) In a left differentiable coordinate system

$$ab = a + b + |a| F(a, b)$$

where $F$ is locally continuous and $F(O, O) = O$.

PROOF. Let

$$\Omega(t, s, b; e) = \frac{1}{t}\big[((s + t)e)b - (se)b\big] - \psi(s, b; e), \quad t \neq 0; \qquad \Omega(0, s, b; e) = O. \tag{3.8}$$


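$\Omega$ is defined over some domain $D_\delta$: $|s|, |t|, |b| < \delta$, $|e| = 1$, and is
continuous, evidently, at the places where $t \neq 0$. We assert that $\Omega$ is locally
continuous. It is sufficient to show that as $t \to 0$, $\Omega$ converges to $O$ uniformly
with respect to $s$, $b$, $e$ when $|s| < \delta$, $|b| < \delta$, $|e| = 1$. We have, for each coordinate,

$$\frac{1}{t}\big[((s + t)e)b - (se)b\big]^i = \frac{1}{t}\int_s^{s+t} \psi^i(\sigma, b; e)\, d\sigma = \psi^i(s_i, b; e),$$

where $s_i$ lies between $s$ and $s + t$. Since $\psi(s, b; e)$ is locally continuous,
$(1/t)[((s + t)e)b - (se)b]$ is uniformly near $\psi(s, b; e)$ when $|t|$ is small, which proves
our assertion. Now put $s = 0$ in (3.8), multiply by $t$ and use (3.4). We obtain

$$(te)b = b + t\psi(b; e) + t\Omega(t, 0, b; e). \tag{3.9}$$

By (3.5) and the local continuity of $\psi(b; e)$, we may write $\psi(b; e) = e + E(b; e)$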




where $E$ is locally continuous and $E(O; e) = O$. On substituting into (3.9)
we obtain

$$(te)b = te + b + tE(b; e) + t\Omega(t, 0, b; e) = te + b + tE'(t, b; e) \tag{3.10}$$

where $E'$ is locally continuous and $E'(0, O; e) = O$. On putting $a = te$ and
allowing only non-negative values of $t$, so that $|a| = t$, (3.10) becomes $ab =
a + b + |a| F(a, b; e)$ where $F$ is locally continuous and $F(O, O; e) = O$.
Evidently $e$ cannot enter the function $F$ effectively; hence it can be omitted. This
completes the proof of (3.7).
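
The shape of (3.7) can be seen on a small explicit group. The sketch below is an illustration under stated assumptions, not an example from the paper: it takes the matrices $\begin{pmatrix} 1+x & y \\ 0 & 1 \end{pmatrix}$ with the hypothetical coordinates $(x, y)$, so that the identity sits at the origin, and computes the remainder $F(a, b) = (ab - a - b)/|a|$ numerically. The function names are likewise assumptions.

```python
import math


def mul(a, b):
    """Product in the hypothetical coordinates (x, y) <-> [[1 + x, y], [0, 1]].

    Multiplying the two matrices gives
    (x1, y1)(x2, y2) = (x1 + x2 + x1*x2, y1 + y2 + x1*y2).
    """
    return (a[0] + b[0] + a[0] * b[0], a[1] + b[1] + a[0] * b[1])


def norm(a):
    """The modulus |a| = (sum of squared coordinates)^(1/2)."""
    return math.hypot(a[0], a[1])


def remainder_F(a, b):
    """F(a, b) defined by ab = a + b + |a| F(a, b), with F(O, b) set to O."""
    n = norm(a)
    if n == 0.0:
        return (0.0, 0.0)
    ab = mul(a, b)
    return ((ab[0] - a[0] - b[0]) / n, (ab[1] - a[1] - b[1]) / n)
```

For this particular group $ab - a - b = a^1 b$, so $F(a, b) = (a^1/|a|)\, b$ and $|F(a, b)| \leq |b|$; in particular $F$ tends to $O$ as $b \to O$, in agreement with $F(O, O) = O$.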

From (3.7) we obtain: $a^2 = 2a + |a| F(a, a)$, $a^3 = aa^2 = a + a^2 +
|a| F(a, a^2) = 3a + |a| (F(a, a) + F(a, a^2))$, and so on. Hence

$$a^n = na + |a| \big(F(a, a) + F(a, a^2) + \cdots + F(a, a^{n-1})\big). \tag{3.11}$$

All the terms in this formula are well defined if $a$ is restricted to a sufficiently
small nucleus. In case $n = 2$, we may write simply

$$a^2 = 2a + |a| F(a), \tag{3.12}$$

where $F(a)$ is locally continuous and $F(O) = O$.
Putting $b = a^{-1}$ in (3.7), we obtain

$$a^{-1} = -a - |a| F(a, a^{-1}). \tag{3.13}$$


4. Analytic coordinates


The constant terms in the power series expansions for the functions $(ab)^i$
in an analytic coordinate system must vanish since $(OO)^i = 0$. To compute
the linear terms, we note that $(\partial(ab)^i/\partial a^j)_{a=O,\ b=O} = \delta^i_j$ (see (3.2)). By
symmetry, the same relations hold if $a^j$ is replaced by $b^j$. Hence we have

THEOREM (4.1). In an analytic coordinate system the power series which define
$ab$ are of the form

$$(ab)^i = a^i + b^i + \cdots \qquad (i = 1, \ldots, r)$$

where the dots stand for terms of degree greater than 1.

THEOREM (4.2). Relative to an analytic coordinate system for $G_r$
let $G_h, G_k, \ldots, G_m$ ($h + k + \cdots + m = r$) be linear subspaces which contain $O$
and together span $G_r$. There exists a nucleus $V$ such that if $a, b, \ldots, c$ are arbitrary
elements in $V \cap G_h, V \cap G_k, \ldots, V \cap G_m$, the correspondence

$$a + b + \cdots + c \to ab \cdots c$$

is a homeomorphism. Every element near $O$ is therefore uniquely expressible in
the form $ab \cdots c$.
Suppose for example that we have two spaces, say $G_h$, $G_k$ with $h + k = r$.
We may perfectly well assume that $G_h$ is the $(a^1, \ldots, a^h)$-coordinate space, $G_k$




its linear complement. Then $a' = (a^1, \ldots, a^h, 0, \ldots, 0)$, $a'' = (0, \ldots,
0, a^{h+1}, \ldots, a^r)$ are arbitrary elements of $G_h$, $G_k$; the product $c = a'a''$ is a
function of $(a^1, \ldots, a^r) = a' + a''$. It is sufficient to show that the jacobian
matrix $J(a) = (\partial c^i/\partial a^j)$ is non-singular at $a = O$. With the aid of (4.1) we find
immediately that $J(O)$ is the identity matrix.


5. Canonical coordinates 


Let $t$, $s$ be real variables. A coordinate system $\Sigma$ for $G_r$ is canonical if the
relations

$$(ta)(sa) = (t + s)a; \qquad \frac{d}{dt_0}\big((ta)(tb)\big) = a + b \tag{5.1}$$

hold in some neighborhood of $a = O$, $b = O$, $t = 0$. Here, as always, $d/dt_0$
means the derivative at $t = 0$.

We note that (5.1)$_2$ is always implied by left-differentiability. For, by (3.7),

$$\frac{1}{t}\big((ta)(tb)\big) = a + b + |a| F(ta, tb) \to a + b \tag{5.2}$$

as $t \to 0$ (written for $t > 0$; for $t < 0$ the last term only changes sign). Hence

(5.3) A left differentiable coordinate system in which (5.1)$_1$ holds is canonical.
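
Both relations (5.1) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not a construction from the paper: it takes the matrices $\begin{pmatrix} A & B \\ 0 & 1 \end{pmatrix}$ with $A > 0$, and uses as hypothetical coordinates $(u, v)$ those for which the matrix is the exponential of $\begin{pmatrix} u & v \\ 0 & 0 \end{pmatrix}$. All function names are assumptions made for the illustration.

```python
import math


def exp_coords(a):
    """Coordinates (u, v) -> matrix entries (A, B) of exp([[u, v], [0, 0]])."""
    u, v = a
    A = math.exp(u)
    B = v * (math.expm1(u) / u) if u != 0.0 else v
    return (A, B)


def log_coords(g):
    """Inverse of exp_coords: matrix entries (A, B) with A > 0 -> (u, v)."""
    A, B = g
    u = math.log(A)
    v = B * (u / math.expm1(u)) if u != 0.0 else B
    return (u, v)


def mul(a, b):
    """Group product expressed in the coordinates (u, v).

    In matrix form [[A1, B1], [0, 1]] [[A2, B2], [0, 1]] = [[A1*A2, A1*B2 + B1], [0, 1]].
    """
    A1, B1 = exp_coords(a)
    A2, B2 = exp_coords(b)
    return log_coords((A1 * A2, A1 * B2 + B1))


def scale(t, a):
    """The vector ta."""
    return (t * a[0], t * a[1])
```

With these coordinates `mul(scale(t, a), scale(s, a))` agrees with `scale(t + s, a)` up to rounding, and the difference quotient `scale(1 / h, mul(scale(h, a), scale(h, b)))` approaches $a + b$ as $h \to 0$, which is the content of (5.1)$_1$ and (5.1)$_2$ for this example.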

It is evident that if two coordinate systems for $G_r$ are linearly related to each
other and if one is canonical, so is the other. A sort of converse is

THEOREM (5.4). A local homomorphism $\tau: G_r \to G'_s$, expressed in terms of
canonical coordinates for $G_r$ and $G'_s$, is linear over some nucleus of $G_r$.

PROOF. The theorem asserts that there exists a nucleus V of $G_r$ such that

(5.5)    $\tau(ta) = t\tau(a), \qquad \tau(a + b) = \tau(a) + \tau(b)$

whenever a, b, ta are in V. To simplify the exposition we shall make no further
mention of the domains in which functions and operations are defined. (The
missing details are not far from trivial.) Since coordinates are canonical,

(5.6)    $\tau(nb) = \tau(b^n) = (\tau(b))^n = n\tau(b)$

where n is a positive or negative integer. Writing a = nb, we obtain $\tau(a/n) =
\tau(a)/n$ where a/n means (1/n)a. If we now apply (5.6) with n replaced by m
and b by a/n we get

    $\tau\bigl(\tfrac{m}{n}a\bigr) = \tfrac{m}{n}\tau(a)$.

Hence $(5.5)_1$ follows by continuity. Suppose that under the correspondence
$\tau$, $a \to a_0$ and $b \to b_0$. By what we have just proved, $ta \to ta_0$, $tb \to tb_0$. Hence
$(ta)(tb) \to (ta_0)(tb_0)$ and hence $t^{-1}((ta)(tb)) \to t^{-1}((ta_0)(tb_0))$. On letting t approach
0, we have, using $(5.1)_2$, $a + b \to a_0 + b_0$, which is $(5.5)_2$.

THEOREM (5.7). Let $G_r$ be a local group with an analytic canonical coordinate
system. Let H be a local subgroup of $G_r$. There exists in $G_r$ a linear subspace
$\mathfrak{G}$, possibly 0-dimensional, and a nucleus W such that $\mathfrak{G} \cap W = H \cap W$. Evidently
$\mathfrak{G}$ may be regarded as a local subgroup of $G_r$, locally identical with H. Any
coordinate system for $\mathfrak{G}$ which is linearly related to that of $G_r$ is an analytic canonical
system for $\mathfrak{G}$.

PROOF. The second part of the theorem is evident once the existence of $\mathfrak{G}$
and W is established.

Let V be a spherical nucleus of $G_r$ so small that $V \cap H$ is a nucleus of H. Let
$\lambda$ be a line through O such that the segment $\lambda \cap V$ is contained in H. Let $\mathfrak{G}$
be the point-set union of the lines $\lambda$ if any exist, otherwise let $\mathfrak{G} = \{O\}$. We
assert that $\mathfrak{G}$ is a linear subspace of $G_r$. It will be sufficient to show that when
$\mathfrak{G}$ does not merely consist of O, the sum of two elements of $\mathfrak{G}$ near O is an element
of $\mathfrak{G}$. Let U be a nucleus so small that products and vector sums of pairs
of elements in U are in V. Let a, b be elements in $\mathfrak{G} \cap U$. Then a + b is in V.
Since a, b are also in $H \cap U$, we have $(a/n)(b/n) \in H \cap V$ where a/n means
(1/n)a. From $(5.1)_2$, $n((a/n)(b/n))$ is arbitrarily near a + b if n is large enough.
Hence for large n the elements $m((a/n)(b/n))$, $m = -n, -n + 1, \cdots, n$,
lie close together along a linear segment beginning near $-(a + b)$ and ending
near a + b. Hence the segment $\sigma$ joining $-(a + b)$ to a + b consists of cluster
points of points $m((a/n)(b/n))$. Coordinates being canonical, these points are
mth powers of elements $(a/n)(b/n)$ which are in H, hence they themselves, as
well as the points of $\sigma$, are in H. Hence $a + b \in \mathfrak{G}$.

We can now prove that $\mathfrak{G} \cap W = H \cap W$ for some nucleus W. Suppose no
such W exists. Then there exists a sequence $\{a_i\}$ converging to O, where $a_i$
is in H but not in $\mathfrak{G}$. Let $\mathfrak{G}'$ be the linear complement of $\mathfrak{G}$ and write $a_i = g_ig_i'$
($g_i \in \mathfrak{G}$, $g_i' \in \mathfrak{G}'$) in accordance with (4.2). (In case $\mathfrak{G} = \{O\}$, take $g_i = O$, $g_i' = a_i$.)
Since $g_i' \to O$, the positive and negative multiples of $g_i'$, i large, lie close together
along a line through O. Hence there is a line $\mu$ through O consisting of cluster
points of the set $\{mg_i'\}$ ($m = \pm 1, \pm 2, \cdots$, $i = 1, 2, \cdots$). Since these points
are in $\mathfrak{G}'$, so is $\mu$. We shall show however that $\mu$ is in $\mathfrak{G}$, a contradiction which
completes the proof of the theorem. Let p be a point of $\mu \cap V$. We may write
$p = \lim m_jg_j'$ as $j \to \infty$, where $m_j \to \infty$ and $\{g_j'\}$ is a subsequence of $\{g_i'\}$. Since
$a_j \in H$ and since $g_j \to O$ so that $g_j \in H$ when j is large, elements $g_j' = g_j^{-1}a_j$ with
large j are in H. Hence for large j, $m_jg_j' = (g_j')^{m_j} \in H$. Hence $p \in H \cap V$. The
segment $\mu \cap V$ is therefore contained in H, so that $\mu$ is one of the lines $\lambda$ and $\mu \subset \mathfrak{G}$.


6. The full matric group 


We recall the following elementary facts about real square matrices. Let
$X = (X^{ij})$ be such a matrix. The series

    $e^{tX} = I + tX + \dfrac{t^2}{2!}X^2 + \cdots$

where I is the identity matrix and t a real variable, converges for every X and
t. Evidently the elements $(e^{tX})^{ij}$ are power series in the variables $X^{ij}$, t, convergent
for all values of these variables. It can be seen by direct substitution
that $Y = e^{tX}$ satisfies the equation

    $\dfrac{dY}{dt} = XY$.

Y is, in fact, that (unique) solution which satisfies the initial condition Y(0) = I.
Since Y(t + s) and Y(t)Y(s) are solutions which equal Y(s) when t = 0, we have:
$e^{(t+s)X} = e^{tX}e^{sX}$.
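
The addition rule for the exponential series can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names (`mat_mul`, `mat_exp`, `max_diff`), the test matrix and the truncation at 40 terms are all illustrative assumptions.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(X, t=1.0, terms=40):
    """Partial sum I + tX + (tX)^2/2! + ... of the exponential series."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds (tX)^k / k!
    tX = [[t * v for v in row] for row in X]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, tX)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def max_diff(A, B):
    """Largest absolute entrywise difference of two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


X = [[0.3, -1.2], [0.7, 0.1]]   # arbitrary test matrix (assumption)
t, s = 0.4, 0.9
lhs = mat_exp(X, t + s)                       # e^{(t+s)X}
rhs = mat_mul(mat_exp(X, t), mat_exp(X, s))   # e^{tX} e^{sX}
print(max_diff(lhs, rhs))                     # should be at rounding-error level
```

The truncated series only approximates $e^{tX}$, so agreement is expected up to floating-point error rather than exactly.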

Now let $M^r = \{I, A, \cdots\}$ be the group of real r-rowed non-singular square
matrices. An element A of $M^r$ may be regarded as a point in $E_{r^2}$ with coordinates
$A^{11}, A^{12}, \cdots, A^{rr}$ written in some definite order. $M^r$ is thus an open
subset of $E_{r^2}$ and may itself be regarded as a space, subspace of $E_{r^2}$. Since the
functions AB, $A^{-1}$ are continuous, $M^r$ is a topological group. We call
$A^{11}, \cdots, A^{rr}$ the natural coordinates of A in $M^r$.

THEOREM (6.1). There exists an $r^2$-parameter local group $M_{r^2}$ which has an
analytical canonical coordinate system $\Sigma$, and is locally isomorphic to $M^r$. In
terms of $\Sigma$ and the natural coordinates of $M^r$, there is a local isomorphism $\tau: M^r \to
M_{r^2}$ which is analytic in the neighborhood of the identity.

PROOF. Let $E_{(ij)}$ be the r-rowed matrix which contains a 1 in the (i, j)-position
and zeros elsewhere. Let $X(x) = e^{E}$ where $E = \sum x^{ij}E_{(ij)}$. The
natural coordinates $X^{ij}(x)$ are everywhere convergent power series in
$x^{11}, \cdots, x^{rr}$. Now let $x^{11}, \cdots, x^{rr}$ be regarded as coordinates in a euclidean
space $M_{r^2}$. The analytic correspondence $\sigma: x \to X(x)$ maps a neighborhood of
the origin of $M_{r^2}$ into a point set in $M^r$ containing I. This mapping is in fact
a homeomorphism. To show this it is sufficient to show that the jacobian
matrix of $\sigma$ is non-singular at x = O. We have

    $\left(\dfrac{\partial X^{ij}}{\partial x^{pq}}\right)_{x=O} = (E_{(pq)})^{ij}$

which equals 1 when (ij) = (pq), zero otherwise. The matrix in question is
therefore the $r^2$-rowed identity at x = O. Let $\tau: X(x) \to x$ be the inverse of $\sigma$.
With the aid of $\tau$ we carry group products of elements in a suitable nucleus of
$M^r$ isomorphically over into $M_{r^2}$, converting $M_{r^2}$ into a local $r^2$-parameter group.
The coordinate system $\Sigma$: $x^{11}, \cdots, x^{rr}$ in $M_{r^2}$ has the required properties.
Analyticity of $\Sigma$ follows from the analyticity of $\tau$ and the fact that products in
$M^r$ are defined by polynomials in the natural coordinates. That $\Sigma$ is canonical
follows from (5.3) and the identity $e^{tX}e^{sX} = e^{(t+s)X}$.


7. The function $a^{-1}ba$


This function of a and b plays an important part in the analytical theory of 
local groups. 

THEOREM (7.1). In canonical coordinates the function $a^{-1}ba$ is linear in b
and analytic in a. More precisely, there exists a $\delta > 0$ such that for $|a| < \delta$,
$|b| < \delta$, the coordinates $(a^{-1}ba)^i$ are linear functions of $b^1, \cdots, b^r$ with coefficients
which are power series in $a^1, \cdots, a^r$ convergent for $|a| < \delta$.

PROOF. Let $f(t) = a^{-1}(tb)a$ where a, b are fixed elements near O, t a real
variable. The points f(t) with $-1 \le t \le 1$ constitute a simple arc $\gamma$ through O.
Since coordinates in $G_r$ are canonical, $f(s)f(t) = f(s + t)$, $(f(t))^{-1} = f(-t)$, for
small t, s. Hence $\gamma$ is a local subgroup of $G_r$. By (5.7), $\gamma$ may be regarded as
imbedded in a local subgroup $\mathfrak{G}$ which is a line through O. We may write
$\mathfrak{G} = \{se\}$ where e is a unit vector in $\mathfrak{G}$. The real parameter s is a canonical coordinate
for $\mathfrak{G}$. Now let t be regarded as a canonical coordinate for R, the
one-parameter additive group of real numbers. Evidently f(t) defines a local
isomorphism $R \to \mathfrak{G}$. In terms of s and t, this isomorphism is given by s = kt,
$k \ne 0$ (5.4). Hence f(t) = kte. In particular, f(1) = ke so that f(t) = tf(1),
or $a^{-1}(tb)a = t(a^{-1}ba)$, proving that $a^{-1}ba$ is homogeneous in b.

We next show that $a^{-1}ba$ is additive in b. Let $b' = a^{-1}ba$, $c' = a^{-1}ca$. Using
$(5.1)_2$ and the homogeneity just established we have, as $t \to 0$,

    $b' + c' = \lim \dfrac{1}{t}\bigl((tb')(tc')\bigr) = \lim \dfrac{1}{t}\bigl((a^{-1}(tb)a)(a^{-1}(tc)a)\bigr)$
    $\qquad\; = \lim \dfrac{1}{t}\bigl(a^{-1}((tb)(tc))a\bigr) = \lim a^{-1}\Bigl(\dfrac{1}{t}\bigl((tb)(tc)\bigr)\Bigr)a$
    $\qquad\; = a^{-1}(b + c)a$.

We have now shown that for given a, $a^{-1}ba$ is linear in b. Let A(a) be the
matrix of coefficients of the transformation $b \to a^{-1}ba$. We show that the
elements of A(a) are convergent power series in $a^1, \cdots, a^r$. Since A(ac) =
A(a)A(c), the matrices A(a), with |a| small, are elements of the full matric
group $M^r$. Consider the local group $M_{r^2}$ with canonical coordinates $x^{11}, \cdots, x^{rr}$
and the local analytic isomorphism $\tau: M^r \to M_{r^2}$ of theorem (6.1). Let $\sigma(a) =
\tau(A(a))$. The set $\Gamma = \{\sigma(a), |a| < \delta\}$ is defined and closed (in $M_{r^2}$) if $\delta$ is
sufficiently small. Since $\sigma(ac) = \sigma(a)\sigma(c)$ and $\sigma(a^{-1}) = (\sigma(a))^{-1}$, $\Gamma$ is a local
subgroup of $M_{r^2}$ according to (1.1) and (1.2), and $\sigma$ defines a local homomorphism
$G_r \to \Gamma$. By theorem (5.7), $\Gamma$ coincides, in the neighborhood of its
identity, with a linear subspace N of $M_{r^2}$. Hence $\Gamma$ may be considered as
imbedded in N, hence admits canonical coordinates $y^1, \cdots, y^\nu$ linearly related
to $x^{11}, \cdots, x^{rr}$ (5.7), say $x^{ij} = L^{ij}(y)$. Now $\tau^{-1}$ is given say by

    $A^{ij} = \psi^{ij}(x^{11}, \cdots, x^{rr})$

where the $\psi^{ij}$ are convergent power series and the $A^{ij}$ are natural coordinates
in $M^r$. Let $x^{ij}(a)$ be the $x^{ij}$-coordinates of $\sigma(a)$, $y^i(a)$ the $y^i$-coordinates of
$\sigma(a)$. Then $x^{ij}(a) = L^{ij}(y^1(a), \cdots, y^\nu(a))$ and $A^{ij}(a) = \psi^{ij}(x^{11}(a), \cdots, x^{rr}(a))$.
Now in terms of $a^1, \cdots, a^r$ and $y^1, \cdots, y^\nu$, $\sigma$ is linear (5.4). Hence $y^i(a)$ is
linear in $a^1, \cdots, a^r$ and hence the $A^{ij}(a)$ are convergent power series in
$a^1, \cdots, a^r$. This completes the proof.




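We obtain an explicit expansion for $a^{-1}ba$. Denote by $\mathfrak{A}_t$ the linear transformation
$b \to (ta)^{-1}b(ta)$. Evidently $\mathfrak{A}_{t+s} = \mathfrak{A}_t\mathfrak{A}_s$. Let

(7.2)    $[a, b] = \dfrac{d}{ds_0}\mathfrak{A}_sb = \dfrac{d}{ds_0}(sa)^{-1}b(sa)$.

Evidently [a, b] is linear in b. Let A be the linear transformation $b \to [a, b]$.
We have

    $\dfrac{d\mathfrak{A}_tb}{dt} = \lim \dfrac{1}{s}(\mathfrak{A}_{t+s} - \mathfrak{A}_t)b = \lim \dfrac{1}{s}(\mathfrak{A}_s - 1)\mathfrak{A}_tb = A\mathfrak{A}_tb$.

Hence (see §6) $\mathfrak{A}_t = e^{tA}$. Writing $\mathfrak{A}_1 = \mathfrak{A}$, we have

(7.3)    $a^{-1}ba = \mathfrak{A}b = e^{A}b = b + [a, b] + \dfrac{1}{2!}[a, [a, b]] + \cdots$.

The function [a, b], linear in b, is called the commutator of a, b. We shall see
later that [a, b] is also linear in a.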
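
The expansion (7.3) can be illustrated numerically in the full matric group. The sketch below assumes exponential (canonical) coordinates, so that the element with coordinates X is $e^X$ and the coordinates of $a^{-1}ba$ are $e^{-X}Ye^{X}$; under that assumption the bracket (7.2) becomes YX - XY. The helper names and test matrices are illustrative only.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_exp(X, terms=40):
    """Partial sum of the exponential series I + X + X^2/2! + ... ."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, X)]
        result = mat_add(result, term)
    return result


def bracket(X, Y):
    """Matrix form of [a, b] = d/ds (sa)^{-1} b (sa) at s = 0, i.e. YX - XY."""
    return mat_sub(mat_mul(Y, X), mat_mul(X, Y))


def conjugation_series(X, Y, terms=40):
    """Partial sum b + [a,b] + [a,[a,b]]/2! + ... of (7.3)."""
    total = [row[:] for row in Y]
    term = [row[:] for row in Y]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in bracket(X, term)]
        total = mat_add(total, term)
    return total


X = [[0.0, 0.5], [-0.3, 0.2]]   # coordinates of a (assumption)
Y = [[1.0, 2.0], [0.5, -1.0]]   # coordinates of b (assumption)
minus_X = [[-v for v in row] for row in X]
direct = mat_mul(mat_mul(mat_exp(minus_X), Y), mat_exp(X))   # e^{-X} Y e^{X}
series = conjugation_series(X, Y)
error = max(abs(d - s) for rd, rs in zip(direct, series) for d, s in zip(rd, rs))
print(error)   # should be at rounding-error level
```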
8. Existence of the 1-parameter subgroups a(t) 


The propositions in this section concern a $G_r$ with left-differentiable coordinate
system. V being a nucleus of $G_r$ let

    $F(V) = \text{l.u.b.}\ \{|F(a, b)|,\ a \in V,\ b \in V\}$

where F is defined in (3.7). Local continuity of F and the relation F(O, O) = O
imply that $F(V) \to 0$ as diameter $V \to 0$.

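(8.1) Let $V_\delta$ be a spherical nucleus of radius $\delta$ such that $F(V_\delta) < 1$. Let
$a_1, a_2, \cdots$ be a sequence of elements in $V_\delta$ such that $|a_n| < (\delta/8)(3/4)^n$.
Then the products

    $a_1,\ a_2a_1,\ a_3a_2a_1,\ \cdots$

(call them $p_1, p_2, \cdots$) are defined and

    $|p_n| < \dfrac{3\delta}{4}\Bigl(1 - \bigl(\tfrac{3}{4}\bigr)^n\Bigr)$.

Hence in particular $p_n \in V_\delta$.

PROOF. Assume that $p_{n-1}$ is defined and satisfies the inequality. Then
$p_n = a_np_{n-1}$ is defined and by (3.7)

    $p_n = a_n + p_{n-1} + |a_n|\,F(a_n, p_{n-1})$.

Since $|F(a_n, p_{n-1})| < 1$, $|p_n - p_{n-1}| < 2|a_n| < (\delta/4)(3/4)^n$. Hence

    $|p_n| < \dfrac{\delta}{4}\bigl(\tfrac{3}{4}\bigr)^n + \dfrac{3\delta}{4}\Bigl(1 - \bigl(\tfrac{3}{4}\bigr)^{n-1}\Bigr) = \dfrac{3\delta}{4}\Bigl(1 - \bigl(\tfrac{3}{4}\bigr)^n\Bigr)$.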




Let Z denote a definitely chosen nucleus such that F(Z) < 1/4.

(8.2) If $O \ne a \in Z$, then (i) $|a| \le (3/4)|a^2|$ and (ii) the angle $\theta$ between the
vectors $a, a^2$ and $a, 2a$ (drawn from a to $a^2$ and from a to 2a) is less than a right angle.

PROOF. Using (3.12) and the fact that F(Z) < 1/4 < 2/3 we have

    $|2a| \le |a^2| + |a^2 - 2a| < |a^2| + \tfrac{2}{3}|a|$

when $a \in Z$. Transposing the last term and multiplying by 3/4 gives (i). As
for (ii), we have $|a^2 - 2a| < |a|$ (3.12). Hence in the triangle formed by the
vectors in question, the side opposite $\theta$ is shorter than one side adjacent to $\theta$.
Hence (ii) is true.

(8.3) Not every power of an element in Z is in Z. For if $a \in Z$ and $n \to \infty$,

    $|a^{2^n}| \ge \bigl(\tfrac{4}{3}\bigr)^n|a| \to \infty$.


(8.4) Every element a in Z has a square root $a^{1/2}$ such that $|a^{1/2}| < (3/4)|a|$.

PROOF. Let $\sigma$ be the mapping $a \to a^2$. If b is a point on the spherical boundary
of Z, then $b \ne \sigma b$ and the vector $b, \sigma b$ makes an acute angle with the outward
normal vector at b (8.2), from which it follows (see appendix) that $Z \subset \sigma(Z)$.
Hence every element a in Z is the square of at least one element $a^{1/2}$ in Z. The
stated inequality for $a^{1/2}$ follows from (8.2).

We shall call $a^{1/2}$ a principal square root of a if $|a^{1/2}| < |a|$. Proposition
(8.4) implies that every element in Z has a principal square root in Z.

(8.5) Z contains a nucleus W the principal square roots of whose elements
are unique.

PROOF. Let $Z'$, $Z''$ be symmetric nuclei such that $Z'' \subset Z'$, $Z' \subset Z$. Let

    $M = \text{g.l.b.}\ \{|a - a^{-1}|,\ a \in Z' - Z''\}$.

Evidently M > 0, otherwise Z would contain an element of order 2 which
(8.2) shows to be impossible. Now choose a symmetric nucleus $Y \subset Z''$ such
that $|z - yzy^{-1}| < M$ when $z \in Z$, $y \in Y$. The successive powers of an element
y in Y lie more closely together as y is nearer O. Hence if Y is small enough,
the first element in the sequence $y, y^2, \cdots$ which is not in $Z''$ (there will be
such an element by (8.3)) must lie in $Z' - Z''$. Assume Y to be so chosen.
Let W be a spherical nucleus $W \subset Y$. We assert that the principal square roots
of elements in W are unique. If not, there exist elements a, b, c in W such that
$a^2 = b^2 = c$, $a \ne b$. Let $h = ab^{-1}$. Since $h \in Y$, the smallest power $h^n$ of h
which fails to be in $Z''$ will be in $Z' - Z''$. Now a = hb so that $b^2 = a^2 =
hbhb$, $h^{-1} = b^{-1}hb$, $h^{-n} = b^{-1}h^nb$. Since $Z'$, $Z''$ are symmetric, $h^{-n} \in Z' - Z''$.
Hence $|h^n - h^{-n}| \ge M$. But since $b \in Y$, $|h^n - b^{-1}h^nb| < M$, a contradiction.

In the following proposition $V_\epsilon$ denotes a spherical nucleus of radius $\epsilon$.

(8.6) Let $\delta$ be a number not exceeding the radius of W. For every element
a in $V_{\delta/8}$ there is a uniquely determined function a(t) whose values are elements
in $V_\delta$, defined for $0 \le t \le 1$ with a(0) = O, a(1) = a, and such that a(t)a(s) =
a(s + t). If $a \ne O$, the arc defined by a(t) is simple.

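PROOF. Let a be an element in $V_{\delta/8}$ and let $a^{1/2}$ be the unique principal square
root of a. Since $a^{1/2}$ is also in $V_{\delta/8}$, it has a unique principal square root $a^{1/4}$
in $V_{\delta/8}$, and so on.

Let m, n be non-negative integers with $m < 2^n$. We may write

    $\dfrac{m}{2^n} = \dfrac{\epsilon_1}{2} + \dfrac{\epsilon_2}{4} + \cdots + \dfrac{\epsilon_n}{2^n} \qquad (\epsilon_i = 0 \text{ or } 1)$.

Since $|a| < \delta/8$ and $F(V_{\delta/8}) \le F(Z) < 1/4$, we have (8.4)

    $|a^{1/2^k}| < \bigl(\tfrac{3}{4}\bigr)^k|a| < \dfrac{\delta}{8}\bigl(\tfrac{3}{4}\bigr)^k$.

Writing $x = \epsilon_1/2$, $y = \epsilon_2/4$, $\cdots$ it follows from (8.1) that

    $(a^{1/2^n})^m = (a^xa^y \cdots)$

is defined and in $V_\delta$. The preceding statements evidently hold if $\delta$ is replaced
by a smaller number. Hence we have proved the following proposition: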

A) If a is an element in $V_{\epsilon/8}$ where $\epsilon \le \delta$ then $(a^{1/2^n})^m$ is defined and in $V_\epsilon$.

Now let P be the set of rational numbers $\{m/2^n,\ m < 2^n\}$ dense on the interval
(0, 1). We see readily that for given a, the element $(a^{1/2^n})^m$ depends only
on the value of $m/2^n$. Hence we write $(a^{1/2^n})^m = a^{m/2^n}$. The function $a(p) = a^p$
is single-valued over P. Evidently a(p)a(q) = a(p + q).

We assert that $a(p) \to O$ as $p \to 0$. Take $\epsilon > 0$ and n so large
that $|a(1/2^n)| < \epsilon/8$. Now the values of p which are smaller than $1/2^n$ are of
the form $(1/2^n)(h/2^k)$, $0 < h \le 2^k - 1$. Hence we may apply proposition A)
with a replaced by $a(1/2^n)$. We conclude that $|a(p)| < \epsilon$ when $p < 1/2^n$.
Hence $a(p) \to O$.

We assert next that a(p) is uniformly continuous over P. For take $\epsilon > 0$.
From the preceding paragraph and group continuity there exists a number $\mu$
independent of p such that $|a(p)a(q) - a(p)| < \epsilon$ when $q < \mu$. Hence
$|a(p + q) - a(p)| < \epsilon$ when $q < \mu$, proving our assertion. From this it follows,
by elementary continuity considerations, that there exists a unique continuous
single-valued function a(t) ($0 \le t \le 1$) which coincides with a(p) over P. Evidently
the relation a(p)a(q) = a(p + q) extends itself to the whole interval
(0, 1) by continuity. Moreover, $a(t) \in V_\delta$ since $a(p) \in V_\delta$.

It remains only to be shown that $a(t) \ne a(s)$ when $t \ne s$, $a \ne O$. Suppose on
the contrary that a(t + h) = a(t), $h \ne 0$. Then a(t)a(h) = a(t) so that a(h) = O.
From the relation $|a| \le (3/4)|a^2|$ we have $|a(h/2)| \le (3/4)|(a(h/2))^2| =
(3/4)|a(h)| = 0$. Hence a(h/2) = O and similarly $a(h/2^n) = O$. Taking n large
there exists an m such that $mh/2^n$ is near 1, hence such that $(a(h/2^n))^m$ is near a.
We conclude that a = O, a contradiction.

Let U be the spherical nucleus whose radius is one eighth the radius of W.
The function a(t) is defined for $a \in U$, and the elements a(t) are in W.
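
The dyadic construction in the proof of (8.6) can be imitated for matrices near the identity. This is only a sketch under assumptions not made in the text: the group is a matrix group, and the principal square root is computed from the binomial series for $(I + M)^{1/2}$ rather than obtained from the existence argument of (8.4). All names are illustrative.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_diff(A, B):
    """Largest absolute entrywise difference of two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def sqrt_near_identity(G, terms=60):
    """Square root of G = I + M via sum_k C(1/2, k) M^k (M must be small)."""
    n = len(G)
    I = identity(n)
    M = [[G[i][j] - I[i][j] for j in range(n)] for i in range(n)]
    result = identity(n)
    power = identity(n)
    coeff = 1.0
    for k in range(1, terms):
        coeff *= (0.5 - (k - 1)) / k      # binomial coefficient C(1/2, k)
        power = mat_mul(power, M)
        result = [[result[i][j] + coeff * power[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def dyadic_power(G, m, n):
    """(G^(1/2^n))^m: take n successive square roots, then the m-th power."""
    root = G
    for _ in range(n):
        root = sqrt_near_identity(root)
    result = identity(len(G))
    for _ in range(m):
        result = mat_mul(result, root)
    return result


G = [[1.05, 0.10], [-0.08, 0.97]]        # element near the identity (assumption)
half = dyadic_power(G, 1, 1)             # a(1/2)
quarter = dyadic_power(G, 1, 2)          # a(1/4)
three_quarters = dyadic_power(G, 3, 2)   # a(3/4)
err_square = max_diff(mat_mul(half, half), G)
err_additive = max_diff(mat_mul(quarter, half), three_quarters)
err_consistent = max_diff(dyadic_power(G, 2, 2), half)
print(err_square, err_additive, err_consistent)
```

The three printed quantities test $a(1/2)^2 = a$, $a(1/4)a(1/2) = a(3/4)$, and the fact that $(a^{1/4})^2$ depends only on the value 2/4 = 1/2.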




(8.7) There exists a positive number T depending on a ($a \in U$) such that


a(t 


when $0 < t \le T$.

The proof of the first inequality is essentially the same as that of the second.
Suppose the second is false, that is, suppose that $|a(t_i)/t_i| > 2|a|$ for some a
and some sequence $t_i \to 0$. For each i choose an integer $n_i$ such that

    $1 - t_i \le n_it_i \le 1$.

Then $n_it_i \to 1$ as $i \to \infty$. If we refer to (3.7) and use the fact that F(U) < 1/4
(since $U \subset Z$) we have

(8.8)    $a(n_it_i) = (a(t_i))^{n_i}$
         $\quad = n_ia(t_i) + |a(t_i)|\,\bigl[F(a(t_i), a(t_i)) + \cdots + F(a(t_i), a((n_i - 1)t_i))\bigr]$
         $\quad = n_ia(t_i) + n_i|a(t_i)|\,H$, say,

where |H| < 1/4. The right member of (8.8) is of the form b + |b|c. Write
b + |b|c - |b|c = b. Then $|b + |b|c| + |b||c| \ge |b|$, or
$|b + |b|c| \ge |b|(1 - |c|)$. Hence

(8.9)    $|a(n_it_i)| \ge |n_ia(t_i)|(1 - |H|) > \tfrac{3}{4}|n_ia(t_i)| = \tfrac{3}{4}\,n_it_i\left|\dfrac{a(t_i)}{t_i}\right|$.

Since $n_it_i \to 1$ and $|a(t_i)/t_i| > 2|a|$, the last expression in (8.9) becomes and
remains greater than (5/4)|a| when i is large. This is impossible since
$a(n_it_i) \to a$.

(8.10) If $O \ne a \in U$, $\lim a(t)/t$ ($t \to 0$) exists and is different from O.

PROOF. By virtue of the preceding proposition it is sufficient to prove that
if $\{s_i\}$, $\{t_i\}$ are sequences of positive numbers converging to 0 and such that
$\lim a(s_i)/s_i = A$, $\lim a(t_i)/t_i = B$, then A = B. On suppressing a number of
the first terms in the sequence $\{s_i\}$ if necessary, we may suppose $s_1$ to be so
small that $|a(s_1)/s_1 - A| < |A - B|/3$ and that

(8.11)    $|F(a(t), a(t'))| < |A - B|/3$

when $0 \le (t, t') \le s_1$. For each $t_i$ choose an integer $n_i$ such that

    $s_1 - t_i \le n_it_i \le s_1$.

Then $n_it_i \to s_1$ so that $|a(n_it_i) - a(s_1)| \to 0$. Hence we may write, for i large,

(8.12)    $\left|\dfrac{a(n_it_i)}{s_1} - A\right| < \dfrac{1}{3}|A - B|$.




On examining the definition of H in (8.8) we see that (8.11) implies here that
$|H| < |A - B|/3$. Since $n_it_i \to s_1$ and $a(t_i)/t_i \to B$, it follows that


where $\epsilon_i \to 0$. Use of (8.8) gives


= net + 


— = |A Bl +m 
where $\eta_i \to 0$ as $i \to \infty$. Combine this with (8.12) to obtain

    $|A - B| \le \tfrac{2}{3}|A - B| + \epsilon_i + \eta_i$.

We conclude that $|A - B| \le (2/3)|A - B|$, which implies that A = B.
9. Canonical mappings 


We continue to assume that the coordinate system for $G_r$ is left-differentiable.
From now on we shall denote a(t) by $a^t$. For each a we extend the function $a^t$
over the interval (-1, +1) by the formula $a^{-t} = (a^t)^{-1}$. The function $a^t$
($-1 \le t \le 1$) is defined for every a in a certain spherical nucleus $U \subset W$.
By (8.6) we may assume U to be so small that $a^ta^s$ and $(a^t)^s$ are defined and in W
when $a \in U$ and $|t|, |s| \le 1$.

(9.1) Let a be an element in U. Then $a^ta^s = a^{t+s}$
and $(a^s)^t = a^{st}$ when $|s|, |t| \le 1$. In canonical coordinates, $a^t = ta$.

PROOF. The first formula holds of course when s, t are positive. Extension
to the interval (-1, +1) is an immediate consequence of the relation $a^{-t} =
(a^t)^{-1}$. Similarly it will be sufficient to prove the second formula for positive
t, s. Since $a^{t/2} \in Z$, we have $|a^{t/2}| \le (3/4)|(a^{t/2})^2| = (3/4)|a^t|$. Hence $a^{t/2}$ is
a principal square root of $a^t$. Since $a^t \in W$, $a^{t/2}$ is the unique principal square
root of $a^t$, and hence equal to $(a^t)^s$ with s = 1/2. (See proof of (8.6).) Thus
the formula in question holds when s = 1/2. Similarly it holds when $s = 1/2^n$,
then for $s = m/2^n$, hence, by continuity, for arbitrary s. The relation $a^t = ta$
for canonical coordinates follows immediately from the proof of (8.6).

(9.2) $a^t$ converges to O uniformly with respect to t ($-1 \le t \le 1$) when $a \to O$.
For, by (8.6), $a \in V_\epsilon$ implies $a^t \in V_{8\epsilon}$.

(9.3) The t-derivative of $a^t$ exists at t = 0. For, (8.10) implies that $a^t$ has
a right derivative at t = 0. It is a consequence of (3.13) that for $t \to +0$,
$\lim a^{-t}/t = \lim (a^t)^{-1}/t = -\lim a^t/t$. Hence the left derivative exists at t = 0
and equals the right derivative.

We now note certain properties of the function

    $A(a) = \dfrac{d}{dt_0}a^t = \lim_{t \to 0}\dfrac{a^t}{t} = \lim_{n \to \infty} n\,a^{1/n}$.

(9.4) A(a) is defined over U and $|A(a)| \le 2|a|$. (This inequality is a consequence
of (8.7).)




Left differentiability of ab and differentiability of $a^s$ at s = 0 imply the differentiability
of $a^sb$ with respect to s at s = 0 according to the formula

    $\dfrac{d}{ds_0}(a^sb) = (A(a))^i\,\psi_i(b)$.

Putting $b = a^t$ we obtain

(9.5)    $(A(a))^i\,\psi_i(a^t) = \lim \dfrac{1}{s}(a^sa^t - a^t) = \lim \dfrac{1}{s}(a^{t+s} - a^t) = \dfrac{da^t}{dt}$

so that $a^t$ is t-differentiable throughout.
(9.6) There exists a function M(a) which is continuous at a = O with
M(O) = O and is such that

    $A(a) = a + |A(a)|\,M(a)$.

PROOF. We have only to show that $(A(a) - a)/|A(a)| \to O$ as $a \to O$.
From (9.5) we have

    $(tA(a) - a^t)^j = \displaystyle\int_0^t (A(a))^i\bigl(\delta^j_i - \psi^j_i(a^\tau)\bigr)\,d\tau$.

Hence

(9.7)    $|(tA(a) - a^t)^j| \le t\,\sum_i |(A(a))^i|\,M^j_i$

where

    $M^j_i = \max\ \{|\delta^j_i - \psi^j_i(a^\tau)|,\ 0 \le \tau \le t\}$.

Since $\psi(b)$ is locally continuous and $\psi^j_i(O) = \delta^j_i$ it follows from (9.2) that for
given t, $M^j_i \to 0$ as $a \to O$. The limit in question is now verified by putting
t = 1 in (9.7) and letting $a \to O$.

Certain useful formulas come from (9.5). We have

    $|A(a)| \le |a| + |A(a)|\,|M(a)|$

so that $|A(a)|/|a| \le (1 - |M(a)|)^{-1}$. Hence

    $\dfrac{|A(a)|}{|a|} = 1 + \rho(a)$

where $\rho(a)$ has real values and converges to 0 as $a \to O$. From (9.6) we now
obtain

    $\dfrac{A(a)}{|a|} = \dfrac{a}{|a|} + (1 + \rho(a))\,M(a)$

which, on multiplication by |a| becomes

(9.8)    $A(a) = a + |a|\,M'(a)$

where $M'(a) \to O$ as $a \to O$.




The formula

(9.9)    $A(a^t) = tA(a)$

is a consequence of (9.1). For,

    $A(a^t) = \lim_{s \to 0}\dfrac{(a^t)^s}{s} = t\,\lim_{s \to 0}\dfrac{a^{ts}}{ts} = tA(a)$.


(9.10) A(a) is locally continuous.

PROOF. The correspondence $a \to a^{1/2}$ is single-valued over U (since $U \subset W$,
see (8.5)) and has a single-valued inverse, namely $a \to a^2$. Since this last correspondence
is continuous and U is compact, the correspondence $a \to a^{1/2}$ is
itself continuous. Hence the functions $a^{1/2}, a^{1/4}, \cdots$ are continuous over U.

Now from (9.9) and (9.6),

(9.11)    $A(a) = nA(a^{1/n}) = n\,a^{1/n} + |A(a)|\,M(a^{1/n})$
          $\quad\;\; = n\,a^{1/n} + H(a, n)$, say.

We assert that H(a, n) converges to O uniformly with respect to a as n converges
to $\infty$ through powers of 2. This is a consequence of the relations: $|A(a)| \le 2|a|$
(see 9.4) and $|a^{1/2^k}| < (3/4)^k|a|$ (8.2). From (9.11) we have

(9.12)    $|A(a) - A(b)| \le n\,|a^{1/n} - b^{1/n}| + |H(a, n) - H(b, n)|$.

Let $\epsilon$ be a positive number. Take for n such a large power of 2 that the last
term in (9.12) is smaller than $\epsilon/2$. Since $a^{1/n}$ is continuous, the first term in
the right member of (9.12) becomes smaller than $\epsilon/2$, hence $|A(a) - A(b)|$
becomes smaller than $\epsilon$ when $|a - b|$ is sufficiently small.


Another formula:

(9.13)    $\lim_{t \to 0}\dfrac{1}{t}A(a^tb^t) = A(a) + A(b)$.

PROOF. Writing $c_t = a^tb^t$, we note that $\lim c_t/t = A(a) + A(b)$. This follows
from the relation $c_t = a^t + b^t + |a^t|\,F(a^t, b^t)$ (3.7). The formula in question
now comes from using (9.8) with $c_t$ replacing a.

We shall say that the mapping $\kappa: a \to A(a)$ is canonical if its inverse is single-valued.

Suppose $\kappa$ is canonical. Then if $U'$ is any nucleus whose closure is contained
in U, $\kappa$ maps $U'$ homeomorphically and $\kappa(U')$ is an open set containing O.
Write $\bar a^i = A^i(a)$ and regard the $\bar a^i$ as coordinates in a space $\bar G_r$. If we introduce
products and inverses into $\bar G_r$ by the formulas $A(a)A(b) = A(ab)$, $(A(a))^{-1} =
A(a^{-1})$, then $\bar G_r$ becomes a local group, $\kappa$ a local isomorphism $G_r \to \bar G_r$. Moreover,
(9.9) and (9.13) imply that the coordinates $\bar a^i$ are canonical. Hence

(9.14) If $\kappa$ is canonical, $G_r$ is locally isomorphic to a local group with canonical
coordinates.
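
The limit $A(a) = \lim n\,a^{1/n}$, taken through powers of 2 as in the proof of (9.10), can be watched numerically. The sketch below is an illustration under assumptions of its own: the group is a matrix group, the coordinates of an element G are the entries of G - I, and square roots and the reference logarithm are computed from power series valid near the identity. All names are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_diff(A, B):
    """Largest absolute entrywise difference of two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def series_in_M(G, coefficient, terms=80):
    """Sum over k >= 1 of coefficient(k) * (G - I)^k, for G near I."""
    n = len(G)
    I = identity(n)
    M = [[G[i][j] - I[i][j] for j in range(n)] for i in range(n)]
    result = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms):
        power = mat_mul(power, M)
        c = coefficient(k)
        result = [[result[i][j] + c * power[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def binomial_half(k):
    """Binomial coefficient C(1/2, k)."""
    c = 1.0
    for j in range(1, k + 1):
        c *= (0.5 - (j - 1)) / j
    return c


def sqrt_near_identity(G):
    """Square root I + sum_k C(1/2, k) (G - I)^k."""
    n = len(G)
    S = series_in_M(G, binomial_half)
    I = identity(n)
    return [[I[i][j] + S[i][j] for j in range(n)] for i in range(n)]


def log_near_identity(G):
    """Logarithm sum_k (-1)^(k+1) (G - I)^k / k."""
    return series_in_M(G, lambda k: (1.0 if k % 2 == 1 else -1.0) / k)


def canonical_estimate(G, k):
    """2^k times the coordinates of the 2^k-th root, i.e. n * a^(1/n) with n = 2^k."""
    n = len(G)
    root = G
    for _ in range(k):
        root = sqrt_near_identity(root)
    I = identity(n)
    return [[(2 ** k) * (root[i][j] - I[i][j]) for j in range(n)]
            for i in range(n)]


G = [[1.05, 0.10], [-0.08, 0.97]]   # element near the identity (assumption)
exact = log_near_identity(G)        # the limit A(a) in this model
errors = [max_diff(canonical_estimate(G, k), exact) for k in (2, 6, 10)]
print(errors)                       # should decrease as n = 2^k grows
```

In this model the limit is the matrix logarithm, and the error of $n\,a^{1/n}$ shrinks roughly like 1/n.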




10. Vector fields 


Let R be an open set in a euclidean space $S_n$ with coordinates $x^1, \cdots, x^n$.
A vector field $\xi$ over R is a single-valued function $\xi(x)$ or $\xi x$ defined over R,
with values in $S_n$. To say that $\xi$ is of class k over R means that the k-fold
derivatives of the functions $\xi^i(x^1, \cdots, x^n)$ exist and are continuous over R.
We shall consider also the extreme cases in which $\xi$ is analytic, or merely continuous,
over R.

Let $\xi$ be a continuous vector field over R. For each point x in R there exists
at least one solution f(x, t) of the system of equations $dx^i/dt = \xi^i(x)$ satisfying
the initial condition f(x, 0) = x, and defined for all t in a neighborhood of t = 0.
Suppose that for every choice of x in R the solution f(x, t) is unique in the sense
that if g(x, t) is also a solution with g(x, 0) = x, then g(x, t) = f(x, t) for all
values of t in some neighborhood of t = 0 depending on x. Then for given x, the
solution curve through x can be continued in both directions in a unique manner,
either indefinitely or until f(x, t) runs out of R. Thus for given x, f(x, t) is
defined and single-valued over an interval $-t_1 < t < t_2$ where $t_1$, $t_2$ are positive
and one or both may be $\infty$. We shall say that $\xi$ possesses the uniqueness property
over R and shall denote f(x, t) by $e^{t\xi}x$. The solution $e^{t\xi}x$ is additive in t:
$e^{s\xi}e^{t\xi}x = e^{(s+t)\xi}x$. By standard existence theorems, vector fields which are
analytic or of class k, $k \ge 1$, possess the uniqueness property.
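
A one-dimensional example makes the additivity of $e^{t\xi}x$ and the finite interval of definition concrete. The choice $\xi(x) = x^2$ on the real line is an assumption made for illustration; its solution through x is $x/(1 - tx)$, which exists only while $1 - tx > 0$.

```python
def flow(x, t):
    """Solution f(x, t) of dx/dt = x**2 with f(x, 0) = x.

    The closed form x / (1 - t*x) is valid only while 1 - t*x > 0;
    beyond that the solution has run off to infinity.
    """
    denom = 1.0 - t * x
    if denom <= 0.0:
        raise ValueError("solution is not defined up to time t")
    return x / denom


x0, s, t = 0.5, 0.3, 0.4
composed = flow(flow(x0, t), s)   # e^{s xi} e^{t xi} x
direct = flow(x0, s + t)          # e^{(s+t) xi} x
print(composed, direct)
```

For x0 = 0.5 the solution exists only for t < 2, which illustrates an interval of definition with one finite and one infinite endpoint.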

Let $\xi_1, \cdots, \xi_r$ be vector fields of class $k \ge 1$ over R and let $a^1, \cdots, a^r$ be real
variables. Let $R'$ be a bounded open set such that $R' \subset R$ and let K be a positive
number. There exists a number $\delta > 0$ such that $e^{t(a^1\xi_1 + \cdots + a^r\xi_r)}x$ is defined
for $x \in R'$, $|t| < \delta$, $|a| < K$ ($a = a^1, \cdots, a^r$). Hence

    $e^{a^1\xi_1 + \cdots + a^r\xi_r}x$

is defined for $x \in R'$ and $|a| < \delta K$. Moreover, as a function of $a^1, \cdots, a^r$,
$x^1, \cdots, x^n$, $e^{a^1\xi_1 + \cdots + a^r\xi_r}x$ is of class k; it is analytic if the $\xi_i$ are analytic. For
each a with $|a| < \delta K$, the transformation $x \to e^{a^1\xi_1 + \cdots + a^r\xi_r}x$ transforms $R'$
homeomorphically.


Suppose that $\xi$ is of class 1. Then

(10.1)    $\dfrac{d}{dt}e^{t\xi}x = \xi(e^{t\xi}x); \qquad \dfrac{d}{dt_0}e^{t\xi}x = \xi(x)$;

(10.2)    $\dfrac{d}{dt}\,\dfrac{\partial (e^{t\xi}x)}{\partial x^i} = \dfrac{\partial\,\xi(e^{t\xi}x)}{\partial x^i}$.

Formulas (10.1) are implied by the definitions they involve. Formula (10.2)
is what we obtain by differentiating first with respect to t, then with respect to
$x^i$. Since the resulting function is continuous in (t, x) and since the same is
true of the $x^i$-derivative of $e^{t\xi}x$ it follows from a standard theorem that the
order of differentiation can be reversed.

We shall say that the vector fields $\xi_1, \cdots, \xi_r$ are linearly independent over R
if there exist no real numbers $c^1, \cdots, c^r$ not all zero and no open subset $R'$ of
R such that $\sum c^i\xi_i(x) = O$ identically over $R'$.




11. Realizations 


The continuous groups of the classical theory are transformation groups. 
From our point of view they may be regarded as realizations of local groups by 
means of transformations. 

DEFINITION. Let $G = \{1, a, b, \cdots\}$ be a local group, $S = \{x, y, \cdots\}$ a space.
Suppose there exists a function $a \cdot x$ ($a \in G$, $x \in S$) which is single-valued and continuous
in the pair (a, x) wherever defined, has points of S as values, and satisfies
the following conditions: (1) whenever defined, $1 \cdot x$ equals x; (2) if $a \cdot (b \cdot x)$, $(ab) \cdot x$
are defined, they are equal; (3) $a \cdot x$ equals $a \cdot y$ only if x = y; (4) there is a nucleus
U of G and an open set R in S such that $U \cdot R$ is defined (i.e. $a \cdot x$ is defined when
$a \in U$, $x \in R$). We then say that there is defined a realization (G, S) of G. A
realization (G, S) is effective if there exists a nucleus $U_0$ and open set $R_0$ in S such
that $U_0 \cdot R_0$ is defined and such that the only element a in $U_0$ such that $a \cdot x = x$
over $R_0$ is a = 1. For every local group G there is an effective realization (G, S),
namely the regular realization obtained by taking G = S and defining $a \cdot b$ to be ab.

The following theorem and proof (Lie) are classical. We add here an ele- 
ment of precision required for our purposes. 

(11.1) Let $G_r$, $S_n$ be euclidean spaces of dimensions r, n with coordinate
systems $a^1, \cdots, a^r$ and $x^1, \cdots, x^n$. Let U be a neighborhood of the origin of
$G_r$, R an open subset of $S_n$. Let $\varphi_1, \cdots, \varphi_r$ be vector fields of class k ($k \ge 1$)
over U such that the matrix $(\varphi^i_j(O))$ is non-singular. Let $\xi_1, \cdots, \xi_r$ be a linearly
independent set of continuous vector fields over R such that every vector field
$\sum a^i\xi_i$ possesses the uniqueness property over R. Let $a \cdot x$ be a single-valued
function defined for $a \in U$, $x \in R$ with values in $S_n$ and such that $O \cdot x = x$ when
$x \in R$. Assume that the derivatives $\partial (a \cdot x)^i/\partial a^j$ exist and that over U, R,

(11.1a)    $\dfrac{\partial (a \cdot x)^i}{\partial a^j} = \sum_k \varphi^k_j(a)\,\xi^i_k(a \cdot x)$.

Then there exists a function ab which converts $G_r$ into a local r-parameter group
for which $a \cdot x$ defines an effective realization $(G_r, S_n)$. The group-theoretic
structure of $G_r$ is determined solely by the vector fields $\varphi_i$. The coordinate
system $a^1, \cdots, a^r$ of $G_r$ is of class k. Moreover $G_r$ is locally isomorphic to a
local group $\bar G_r$ with canonical coordinates $\bar a^1, \cdots, \bar a^r$ of class k. The functions

(11.2)    $\bar a * x = e^{\bar a^i\xi_i}x$

define an effective realization $(\bar G_r, S_n)$. The coordinate systems of $G_r$ and $\bar G_r$
are analytic if the $\varphi_i$ are analytic.

PROOF. Let $\psi$ be the inverse of the matrix $\varphi = (\varphi^i_j(a))$. Evidently $\psi(a)$
exists and is of class k in the neighborhood of a = O. Let $\bar a^1, \cdots, \bar a^r$ be coordinates
in a space $\bar G_r$ and let

    $\chi(\bar a) = e^{\bar a^i\psi_i}O$.

We assert that the formula $a = \chi(\bar a)$ defines a homeomorphism of class k between
neighborhoods of the origins of $\bar G_r$ and $G_r$. To prove this, it is sufficient
to show that the matrix $(\partial a^i/\partial \bar a^j)$ is non-singular at $\bar a = O$. We have

    $\left(\dfrac{\partial a^i}{\partial \bar a^j}\right)_{\bar a = O} = \psi^i_j(O)$.

We introduce products ab and $\bar a\bar b$ into $G_r$ and $\bar G_r$ by the formulas

    $ab = e^{\bar a^i\psi_i}e^{\bar b^i\psi_i}O$ where $a = \chi(\bar a)$, $b = \chi(\bar b)$;
    $\bar a\bar b = \chi^{-1}(ab)$.

We shall see eventually that these functions convert $G_r$, $\bar G_r$ into locally isomorphic
local groups, and that (11.2) defines an effective realization of $\bar G_r$.
The structure of $\bar G_r$ is determined by the $\psi_i$ hence by the $\varphi_i$. The functions ab
and $\bar a\bar b$ are evidently of class k or analytic according as the $\varphi_i$ are of class k or
analytic. Moreover the coordinates $\bar a^i$ will be canonical since

    $(s\bar a)(t\bar a) = \chi^{-1}\bigl(e^{s\bar a^i\psi_i}e^{t\bar a^i\psi_i}O\bigr) = \chi^{-1}\chi((s + t)\bar a) = (s + t)\bar a$.

This shows incidentally that the origin of $\bar G_r$ plays the part of the identity and
that elements near it have inverses: $\bar a^{-1} = -\bar a$. Therefore to show that $\bar G_r$
is a local group it is sufficient to show that the associative law holds in the
neighborhood of O. $G_r$ will then also be a local group locally isomorphic to
$\bar G_r$ through the relation $ab = \chi(\bar a\bar b)$.

We first establish the identity

(11.3)    $\bar a * (b \cdot x) = (e^{\bar a^i\psi_i}b) \cdot x$

which holds for x in R and $|\bar a|$, |b| sufficiently small depending on x. If we
replace $\bar a^i$ by $t\bar a^i$ in (11.3), the left member is that solution of the system $dx/dt =
\sum \bar a^i\xi_i$ which equals $b \cdot x$ when t = 0. But so is the right member. For if we
write $b_t = e^{t\bar a^i\psi_i}b$ and use the fact that $\sum_j \varphi^k_j(b)\psi^j_i(b) = \delta^k_i$ we obtain

    $\dfrac{d(b_t \cdot x)}{dt} = \bar a^i\,\xi_k(b_t \cdot x)\,\varphi^k_j(b_t)\,\psi^j_i(b_t) = \bar a^i\xi_i(b_t \cdot x)$.

Since $\sum \bar a^i\xi_i$ possesses the uniqueness property we conclude that (11.3) holds
with the $\bar a^i$ replaced by $t\bar a^i$. In particular we obtain (11.3) by putting
t = 1.

Let $a = \chi(\bar a)$, $b = \chi(\bar b)$. On putting b = O, (11.3) becomes

(11.4)    $\bar a * x = \chi(\bar a) \cdot x = a \cdot x$.

Then on replacing x by $b \cdot x$ in (11.4) we obtain with (11.3):

(11.5)    $a \cdot (b \cdot x) = \bar a * (b \cdot x) = (e^{\bar a^i\psi_i}b) \cdot x = ab \cdot x$.

By (11.4) we have equally well $\bar a * (\bar b * x) = \bar a\bar b * x$. This relation and (11.5) hold
for x in R and for |a|, |b|, $|\bar a|$, $|\bar b|$ sufficiently small depending on x. It
will be seen, in fact, that if $R'$ is a bounded open set such that $R' \subset R$ the relations
in question, namely

(11.6)    $a \cdot (b \cdot x) = ab \cdot x, \qquad \bar a * (\bar b * x) = \bar a\bar b * x$,

hold for $x \in R'$ and |a|, |b|, $|\bar a|$, $|\bar b| < \beta$ say, where $\beta$ is independent of x.

We assert next that there exists a positive number y such that if |a| < y 
and if atx = x identically over R’, thena = 0. If this were not the case, there 
would exist a sequence a; — 0 such that a;*x = x over R’. Then since x = 
a/+*x = n(a;*x), we conclude by an obvious limiting process (ef. proof of (5.7) 
that (te)*x = x over R’ for every t with | t | sufficiently small, e being a suitable 
unit vector. From (11.2) we have e“®'#*""*e"®)y = x over R’ which can only 
happen if >> e’é;(z) = 0 over R’ which is impossible since the &; are linearly in- 
dependent over R. 

It follows from this that if a*x = b*x over R' (a, b near O), then a = b. For,
$(b^{-1}a)*x = b^{-1}*(a*x) = x$, hence $b^{-1}a = O$, b = a. From this comes the
associative law. For if a, b, c are near O, then (a(bc))*x = ((ab)c)*x over R' since
both sides can be reduced to a*(b*(c*x)) by applications of (11.6). Hence
(ab)c = a(bc).—-Thus G, is a local group for which a*x defines a realization. 
That this realization is effective follows from the preceding paragraph. As 
we have already remarked, the function ab converts G, into a local group 
locally isomorphic with G,. From the relations a-~ = a*x (11.4) it follows 
that the functions a-zx define an effective realization of G, . 


12. The Lipschitz condition 


We shall say that a coordinate system for $G_r$ satisfies a right Lipschitz condition
if there exists a nucleus V and a positive number C such that
$|ab - ac| < C\,|b - c|$ for $a, b, c \in V$.

(12.1) In a left differentiable coordinate system which satisfies a right Lipschitz
condition, the mapping a → A(a) (§9) is canonical. Moreover, there exists a
nucleus over which the vector fields $\psi_i$ (3.1) possess the uniqueness property.

PROOF. Let a be an element in the nucleus U (domain of definition of A(a))
and consider the system

(12.2) $dx/dt = \sum_i \mathbf{a}^i \psi_i(x)$, where $\mathbf{a} = A(a)$.

Formula (9.5) shows that $a^t$ is a solution of (12.2). We assert moreover that if
b is an element in U, then $a^t b$ is also a solution. For


(12.3) $\frac{d}{dt}(a^t b) = \lim_{s \to 0} \frac{1}{s}\,(a^{t+s} b - a^t b) = \lim_{s \to 0} \frac{1}{s}\,(a^s(a^t b) - a^t b) = \sum_i \mathbf{a}^i \psi_i(a^t b)$.

Now suppose that f(t) is a solution which assumes the same initial value as
$a^t b$, namely b, when t = 0. We shall show that f(t) = $a^t b$, or what amounts




502 P. A. SMITH 


to the same thing, that the function $g(t) = (a^t b)^{-1} f(t)$ equals O for all values of t
sufficiently near t = 0. For this it is sufficient to show that the derivative of
g(t) exists and equals O identically in the neighborhood of t = 0. We have

$|\Delta g| = |g(t+s) - g(t)| = |(a^{t+s}b)^{-1} f(t+s) - (a^{t+s}b)^{-1} a^s f(t)| < C\,|f(t+s) - a^s f(t)|$

provided |t| and |s| are not too large. From (12.2) we have


a=0 ds» 


Hence $|\Delta g| < C\,|s|\,\epsilon(s, t)$ say, where $\epsilon(s, t) \to 0$ as $s \to 0$. Hence $\Delta g/s \to 0$
and therefore dg/dt = O.

Suppose in particular that b = O. Then any solution of (12.2) which equals
O when t = 0 is identical with $a^t$ near t = 0. Suppose now that A(c) = A(a).
Then $c^t$ also satisfies (12.2) and hence $c^t = a^t$ in a neighborhood of t = 0. We
conclude (with the aid of (9.1)) that $c^t = a^t$ throughout, hence that c = a. It
follows that the mapping a → A(a) is canonical.

We have shown that $\sum \mathbf{a}^i \psi_i$ possesses the uniqueness property over U when $\mathbf{a}$
is the image of an element a under the mapping a → A(a). Since this mapping
is canonical, we have established uniqueness when $\mathbf{a}$ is any element in a certain
neighborhood of O. It is easy to see however that if uniqueness holds for $\sum \mathbf{a}^i \psi_i$,
then it holds also for $\sum \lambda \mathbf{a}^i \psi_i$ where $\lambda$ is a real number. Hence for arbitrary $\mathbf{a}$,
$\sum \mathbf{a}^i \psi_i$ possesses the uniqueness property over U.

We have still to show that the y; are linearly independent. From (12.3) with 
t = 1, we have 


ab = 


over a sufficiently small nucleus V. If the y; are not linearly independent over 
V, we may suppose that ¥, = + identically over some open 
subset V’ of V. Then for every a such that a = (0, ---, 0, a’) (| a’ | small) we 
have ab = b which is impossible. 


THEOREM (12.4) (van Kampen). A $G_r$ with a left-differentiable coordinate
system satisfying a right Lipschitz condition is locally isomorphic to a local
r-parameter group with analytic canonical coordinates.

Let 2’, --- , 2” be the coordinate system of G,. By (12.1) the mapping > 
x = A(x) is canonical and hence it establishes a local isomorphism between G, 
and a local group G, with canonical coordinates x’. Let x, y be elements in the 
nucleus U (existence domain of A(x)) and let x = A(x), y = A(y). Since x‘ = &, 
we have (7.1 and 7.3): 


—tX 


(tx) (sy) (—tx) = s (ey) = sy’, say 




where X is the linear transformation z — [x, z]. Hence (9.9), x‘y’x' = y’” 


where y’ = A ‘(y). Thus 

(12.5) x'y’ = where y’=e“y, y= A(y). 

Now let y = yi, --» y? where the y; are fixed elements in U and thea’ are real 
variables. Let x be an element in U. Then 


aP+s 


oa? 


Write yy? in place of y’ 20 th then transfer y* to the left, step by step using (12.5). 
We obtain 


where 
Az» Zp eon et” y 
in which X; is the linear transformation w — [y; , w], yi = A(yi). Hence 

Now if A is a small enough positive number, the elements y; can be so chosen 
that y; = d6}; let the y; be so fixed. Then y is a function @ of a’, --- , a” alone. 
Moreover the matrices of the transformations X; are constants and hence Z, 
is a convergent power series ¢, in a’ ---, a”. If, in the equations 

O(yx 


which (12.6) now become, we put x = O, we see that the derivatives dy/da’ 
exist and are continuous. The matrix (dy'/da’) is non-singular at a = O. 
In fact 


It follows that the correspondence a — @(a) = y maps a nucleus of G, homeo- 
’ eae onto a neighborhood of the origin of a space G; with coordinates 
a’, , a’. Let us define a function a-zx by the formula a-x = yx, y = (a). 
Then (12.7) becomes 


O(a-x) _ 
da" 


Since the y; are linearly independent and possess the uniqueness property over 
some nucleus of G, (12.1), there exists (11.1) a product function ab converting 


= > v‘(a-z)gi(a). 




G, into a local group for which a’, ---, a’ is an analytic coordinate system, 
Since (11.1) G’ is locally isomorphic to a local group with analytic canonical 
coordinates, it remains only to be shown that G’ is locally isomorphic to G,. 
The correspondence a → $\theta(a)$ is in fact such an isomorphism. For,


$(\theta(ab))x = ab \cdot x = \theta(a)(b \cdot x) = (\theta(a)\theta(b))x$


identically in x. Hence $\theta(ab) = \theta(a)\theta(b)$ which completes the proof.

Let us note that according to (11.1) the structure of G, (hence that of G,) is 
determined by the vector fields ¢;. These in turn are determined solely by the 
transformations X; hence, ultimately by the function [x, y] over G,. Hence if 
we take G, = G = a local group with analytic canonical coordinates, we obtain 

THEOREM (12.8). The group-theoretic structure of an r-parameter local group
with an analytic canonical coordinate system is completely determined by its
commutator function.

THEOREM (12.9). Every left-differentiable canonical coordinate system satisfying
a right Lipschitz condition is analytic. Hence in such a coordinate system the
vector fields $\psi_i$ are analytic. Products are given by the formula


(12.10) $ab = e^{\sum a^i \psi_i}\, b \qquad (a = (a^1, \dots, a^r))$.


PROOF. Let $G_r$ have a coordinate system $\Sigma$ with the stated properties. $G_r$
is locally isomorphic to $G'_r$ with analytic canonical coordinates $\Sigma'$. By (5.4) the
isomorphism $G_r \to G'_r$ is linear in terms of $\Sigma$ and $\Sigma'$. Hence $\Sigma$ is analytic.
Since $\Sigma$ is canonical so that $a^t = ta$ (9.1), the canonical mapping applied to $G_r$
is the identity; that is, $\mathbf{a} = A(a) = a$. Hence (12.3) becomes

$\frac{d}{dt}(a^t b) = \sum_i a^i \psi_i(a^t b)$

which implies (12.10).


13. Vector fields of class 1


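Let $\xi$, $\eta$ be vector fields of class 1 over an open set R in a space $S_n$ with
coordinate system $x^1, \dots, x^n$. The continuous vector field


$[\xi, \eta]^i = \sum_j \left( \xi^j \frac{\partial \eta^i}{\partial x^j} - \eta^j \frac{\partial \xi^i}{\partial x^j} \right)$


is the commutator of $\xi$, $\eta$. Evidently


(13.1) $[\xi, \eta] = -[\eta, \xi]$.

If $\xi$, $\eta$, $\zeta$, $[\xi, \eta]$, $[\eta, \zeta]$, $[\zeta, \xi]$ are of class 1, then

(13.2) $[\xi, [\eta, \zeta]] + [\eta, [\zeta, \xi]] + [\zeta, [\xi, \eta]] = O$.
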

The verification of this well-known identity is straightforward. 
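
As an illustration only, (13.1) and (13.2) can be checked mechanically in a special case. The sketch below assumes linear vector fields $\xi(x) = Ax$ stored as integer matrices, for which the commutator $(\xi \cdot \nabla)\eta - (\eta \cdot \nabla)\xi$ is again linear with matrix BA - AB; the helper names are hypothetical and not taken from the paper.

```python
# Linear vector fields xi(x) = A x are represented by their matrices
# (an assumption made only for this sketch). For xi = A x and eta = B x,
# the commutator (xi . grad) eta - (eta . grad) xi has matrix B A - A B.

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def neg(P):
    return [[-p for p in row] for row in P]

def bracket(A, B):
    """Matrix of the commutator of the linear fields x -> A x and x -> B x."""
    return add(matmul(B, A), neg(matmul(A, B)))

def jacobi_sum(A, B, C):
    """Left member of (13.2) for three linear fields."""
    total = bracket(A, bracket(B, C))
    total = add(total, bracket(B, bracket(C, A)))
    return add(total, bracket(C, bracket(A, B)))

A = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
B = [[0, 1, 1], [2, 0, -1], [1, 5, 0]]
C = [[3, 0, -2], [1, 1, 0], [0, 2, 1]]
ZERO = [[0] * 3 for _ in range(3)]

print(bracket(A, B) == neg(bracket(B, A)))   # antisymmetry, as in (13.1)
print(jacobi_sum(A, B, C) == ZERO)           # Jacobi identity, as in (13.2)
```

Integer entries keep the comparison exact; the check says nothing about non-linear fields of class 1, for which the identity still has to be verified by differentiation.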
We now study the transformation 




(13.3) $T_{ts} = e^{-t\xi}\, e^{s\eta}\, e^{t\xi}$


where $\xi$, $\eta$ are of class 1 over R. Let $z = e^{s\eta} e^{t\xi} x$, $y = e^{t\xi} x$. Then since z = y
when s = 0,


We could equally well have computed the s-derivative for arbitrary s. It is 
sufficient for our purposes to note that this derivative exists. 

We assert that the t-derivative of $dT^i/ds$ exists. The second factor in the
right of (13.4) possesses a t-derivative and it is sufficient to show therefore that
the first factor does also. The existence of this derivative is a consequence of
the relation 

i 


and the fact that the t-derivative of dy’/dx* exists (see (10.2)). 
We obtain an explicit formula for the t-derivative of $dT^i/ds$ as follows. Let


—té t F 


ow! 
Formula (13.4) now becomes 
dTi., 
(13.7) = = = Bin’) 


and (13.5) becomes 6; = >> B‘ Di.. Since the matrices B, D are inverses of 
each other, so are their transposes. Hence 6; = >.B}; D;. Combining this 
with (13.7) we obtain C* = )) 7"D'. Now take the ¢-derivative of this last 
relation, multiply both sides by Bj , sum with respect to k and use the relation 
>Bi Di = 6. The result after a transposition is 


j 
dt dt 


Now 


(22) 


dt 
dD; _ d _ (= ©) 


Hence 


(13.5) = (e e*x) = omy 


On substituting into (13.7a) we obtain finally 


dT? ; 

We define a vector field $(t\xi)\circ\eta$ by the formula
(13.9) ((té)on)'a = 

7 


The $B^i_j$ (13.6) do not involve $\eta$. It follows that $(t\xi)\circ\eta$, regarded as a function of
$\xi$, $\eta$ defined over the totality of vector fields of class 1 over R, is linear in $\eta$.
Formulas (13.7) and (13.8) now become (with x omitted)

d 


(13.10) d8 Ts: = (té)on, 


(13.11) = (té)olE, 


14. Closed systems of class 1 


Let $\xi_1, \dots, \xi_r$ be vector fields of class 1 over $R \subset S_n$ and let $L(\xi_1, \dots, \xi_r)$
be the totality of linear combinations of the $\xi_i$ with real coefficients. If the
commutator of every pair of elements in L is in L, we shall call L a closed system of
class 1. (It is understood of course that a definite coordinate system in $S_n$ has
been chosen.)

Let $\xi$ be an element in the closed system $L(\xi_1, \dots, \xi_r)$ of class 1 and let
$f^i_j(t) = (((t\xi)\circ\xi_j)x)^i$ where x is regarded as fixed. We assert that the real-valued
functions $f^i_j(t)$ are analytic. For, $[\xi, \xi_j] = \sum_k c_{jk}\xi_k$ where the c's are constants.
Then, omitting the index i for the moment, and using (13.11),


$\frac{df_j}{dt} = ((t\xi)\circ[\xi, \xi_j])x = \sum_k c_{jk}\,((t\xi)\circ\xi_k)x = \sum_k c_{jk} f_k$.


Thus for each i, the $f^i_j$ satisfy a system of linear homogeneous first order
equations with constant coefficients, and assume definite values when t = 0. Hence
they are convergent power series in t.

(14.1) If $\xi$, $\eta$ are in the closed system $L(\xi_1, \dots, \xi_r)$ of class 1 then


$(t\xi)\circ\eta = \eta + t[\xi, \eta] + \frac{t^2}{2!}[\xi, [\xi, \eta]] + \cdots$


PROOF. We have seen that $((t\xi)\circ\xi_j)x$ can be represented as a power series in t.
We determine the coefficients. By (13.11)




Setting t = 0 and using the obvious relation $(O\xi)\circ\xi_j = \xi_j$, we obtain the required
expansion for $\eta = \xi_j$. It holds for arbitrary $\eta$ since $(t\xi)\circ\zeta$ and $[\xi, \zeta]$ are linear
in $\zeta$.
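
The expansion (14.1) can be tried out numerically in a simple case. The following sketch is only an illustration under stated assumptions: the fields are linear, $\xi(x) = Ax$ and $\eta(x) = Bx$, the transformation of §13 is read as $T_{ts} = e^{-t\xi} e^{s\eta} e^{t\xi}$ so that $(t\xi)\circ\eta$ has matrix $e^{-tA} B e^{tA}$, and A is taken strictly upper triangular so that all series terminate and exact rational arithmetic applies. The function names are hypothetical.

```python
from fractions import Fraction

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def exp_nilpotent(A, t):
    """exp(tA) for strictly upper-triangular A; the series stops since A^n = 0."""
    n = len(A)
    result, term = identity(n), identity(n)
    for k in range(1, n):
        term = scale(Fraction(t) / k, matmul(term, A))   # t^k A^k / k!
        result = add(result, term)
    return result

def bracket(A, B):
    """Matrix of [xi, eta] for the linear fields xi = A x, eta = B x."""
    return add(matmul(B, A), scale(-1, matmul(A, B)))

def conjugate(A, B, t):
    """Matrix of (t xi) o eta under the assumed reading: exp(-tA) B exp(tA)."""
    return matmul(matmul(exp_nilpotent(A, -t), B), exp_nilpotent(A, t))

def series(A, B, t, terms):
    """Partial sum eta + t[xi,eta] + t^2/2! [xi,[xi,eta]] + ... of (14.1)."""
    total = [[Fraction(0)] * len(A) for _ in A]
    term = [[Fraction(b) for b in row] for row in B]      # term of order 0
    for n in range(terms):
        total = add(total, term)
        term = scale(Fraction(t) / (n + 1), bracket(A, term))
    return total

A = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]    # nilpotent: A^3 = 0
B = [[1, 0, 2], [3, 1, 0], [0, 4, 1]]
t = Fraction(1, 3)

# For a 3x3 nilpotent A the iterated brackets vanish from order 5 on,
# so five terms already give the full series.
print(conjugate(A, B, t) == series(A, B, t, 5))
```

The nilpotent choice is a convenience of the example; for general fields in a closed system the series is infinite and its convergence is what the text establishes.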

COROLLARY (14.2). If $\xi$, $\eta$ belong to a closed system $L(\xi_1, \dots, \xi_r)$ of class 1,
then $(t\xi)\circ\eta$ belongs to L. As a linear combination of $\xi_1, \dots, \xi_r$, $(t\xi)\circ\eta$ has
coefficients which are convergent power series in t.

Suppose that closure relations of the system L (& , --- , §-) in (14.1) are given 
by [é&, &] = > cits. Let C, be the matrix (cj,). On putting & = & in (14.1) 
we find that 


We shall make a useful application of this formula. 

(14.4) Let c; be r* real numbers such that 
(14.5) = + checks + = 0. 
Let Cn = (ci). Then 


Proor. It follows from (14.5) that 


Hence if we associate to each matrix C; a vector field & by the formula ii = 
> ciix', we find that [& , &] = citi. Hence by (14.3) 


7 
The left side equals (see 13.10) 
0 07 


from which (14.6) follows. 
(14.9) Let $\xi$, $\eta$ be vector fields in a closed system $L = L(\xi_1, \dots, \xi_r)$ of class 1.
Let X be the linear transformation $\zeta \to [\xi, \zeta]$ of L into itself. Then


(14.10) = where 7! = 


Proor. Evidently the transformation 7. defined in §13 is additive in s: 
T 12 T tw = T1240. Hence if we write y = we have (ef. 12.3) 


(14.11) Trey = ((té)on)y. 


Now the expansion in (14.1) may be written $(t\xi)\circ\eta = e^{tX}\eta$.




The vector field $\eta' = e^{tX}\eta$, being in L by (14.2), is of class 1. Hence $e^{s\eta'}x$ is well
defined and (14.11) implies that


— 
e = 


which, on replacing t by —t yields (14.10). 

THEOREM (14.12). Let $L(\xi_1, \dots, \xi_r)$ be a closed system of class 1 over an open
set R of a space $S_n$ with coordinate system $x^1, \dots, x^n$. Assume that the $\xi_i$ are
linearly independent over R. There exists a $G_r$ with an analytic coordinate system
$a^1, \dots, a^r$ such that the function


$a \cdot x = e^{a^1 \xi_1} e^{a^2 \xi_2} \cdots e^{a^r \xi_r} x$


defines an effective realization $(G_r, S_n)$.
PROOF. Let $y = a \cdot x$. Then


oy = lim 1 (er oe 


Replace e“”** by then transfer to the left step: by step using 
(14.10). We obtain 


oy 1 sf. —alx —aPxX 
lim; (e "y — y); 
where X; is the linear transformation — [€;, Thus 

0(a-x) _@ 

da? dso 

From (14.2) ¢, is a linear combination of & , --- , &,; its coefficients are power 
series in a’, ---, a”. Hence 

O(a-x) _ 

= 


where the $\varphi^i_p$ are analytic. From (14.3) we see that $(\partial(a \cdot x)/\partial a^p)_{a=O} = \xi_p(x)$.
Hence if we put a = O in (14.14) we find that the matrix $(\varphi^i_p(O))$ is the identity.
Hence our theorem follows from (11.1).

For later reference we shall compute the. ¢; more explicitly. Suppose 
Xé; = = Let C; = (ci). Then = Hence 
when p > l, 


(14.15) fp = Dobe = = 
14. = (e @ 

( 4 16) ) —aP-1¢ 

When p = 1, ¢i(a) = 6. 


e"?(a-x) =¢,(a-2). 


(14.14) 


15. The commutator 


We now examine more closely the commutator function [a, 6] defined by 
(7.2), assuming the coordinate system of G, to be canonical. We know that 


[a, b] is linear in b. We now show that [a, b] is at least homogeneous in a. Let
$\sigma = st$. Then if we use (7.2) and the fact that in canonical coordinates $ta = a^t$,
we have


(15.1) In analytic canonical coordinates


$[a, b] = \lim_{t \to 0} \frac{1}{t^2}\,(b^{-t} a^{-t} b^{t} a^{t})$.

PROOF. Using the homogeneity of [a, b] we have (7.3)


$a^{-t} b^{s} a^{t} = (-ta)(sb)(ta) = s\left(b + t[a, b] + \frac{t^2}{2}[a, [a, b]] + \cdots\right) = sb'$, say.


Using the expansion (4.1) we obtain 
(b ‘a ‘b'a')' = (6 “(a ‘b’a‘))* = ((— sb)(sb’))’ = s(— b' + + 
stla, b] + st’ po 


where $p_1$ and $p_3$ are power series in s, $p_2$ a power series in t. The coefficient of
$t^2$ in (15.2) is zero. On evaluating $(b^{-s} a^{-t} b^{s})\, a^{t}$ in a similar manner we see that,
symmetrically, the coefficient of $s^2$ in (15.2) is zero. Hence the only term of
second degree is st[a, b]. Therefore, on putting s = t we obtain


(15.2) 


(b ‘a ‘b‘a‘) = [a, 
t—0 
THEOREM (15.3). Let $G_r$ be a local group with analytic canonical coordinate
system $a^1, \dots, a^r$. Suppose there exists an effective realization of $G_r$ defined by
the function


(15.4) $a \cdot x = e^{a^1 \xi_1 + \cdots + a^r \xi_r}\, x$


where $\xi_1, \dots, \xi_r$ are vector fields of class 1 over an open set R in a space $S_n$ with
coordinate system $x^1, \dots, x^n$. Then the $\xi_i$ are linearly independent over R and
$L(\xi_1, \dots, \xi_r)$ is closed. Moreover the linear correspondence $a \to \sum a^i \xi_i$ between
$G_r$ and L is such that if $a \to \xi$, $b \to \eta$, then $[a, b] \to [\xi, \eta]$.

Proor. If the &; are not linearly independent over R, we may suppose that 
& = ati + --- + cit, identically over an open subset R’ of R. Then for 
every a = (0,--- , 0, a’) with | a’ | sufficiently small we have a-x = x over R’ 
so that the realization could not be effective-—Suppose now that a — § = 
> a't;,b = > Then = (ta)-2 = a'-x, ex = b'-x. Hence 

Trot = € = a 
or, by (7.3) and the homogeneity of a~’ba, 


= (sb’)-x2,b’ = b+ tla, b] +---. 




On taking the derivative of both sides of (15.4) with respect to s at s = 0 we 
have (see 13.10, 15.4) 


(won = = Dove. 


Now take the derivative with respect to ¢ at ¢ = 0 using (13.11). We obtain 


Hence L is closed and $[a, b] \to [\xi, \eta]$.
(15.5) In an analytic canonical coordinate system, [a, b] satisfies the identities 


(15.6) $[a, b] = -[b, a]$, $[a, [b, c]] + [b, [c, a]] + [c, [a, b]] = O$.


PROOF. The given $G_r$ always possesses a realization $(G_r, S_n)$ satisfying the
hypotheses of (15.3) (the regular realization for example, by virtue of (12.9)).
The identities in question then follow by (15.3) from the corresponding identities
(13.1) and (13.2).

COROLLARY (15.7). [a, b] is linear in both a and b.

(15.8) Let $G_r$, $G_s$ be local groups with analytic canonical coordinates $a^i$ and
$a'^i$ respectively. Suppose there exists a local homomorphism $\tau$: $G_r \to G_s$.
Then $s \le r$. Moreover, if $\tau a = a'$, $\tau b = b'$, then $\tau[a, b] = [a', b']$.

PROOF. By (5.4) $\tau$ induces a linear mapping of the space $G_r$ onto $G_s$. Hence
s can not exceed r. Moreover, since $\tau(ta) = ta'$, we have


(= = 08) (a"). 
Hence by (15.1), [a, b] = [a’, b’]. 
16. Lie groups and algebras 


We shall say that a local group G is an r-parameter Lie group if it is locally
isomorphic to a $G_r$ with left-differentiable coordinates satisfying a right
Lipschitz condition. To make it clear that this definition admits no ambiguity
concerning the number r, we remark that if G is locally isomorphic to $G_s$, a
local group of the same type as $G_r$, then r = s. This is an immediate
consequence of (15.8).

An r-parameter (real) Lie algebra $L_r$ is an r-dimensional vector space
{a, b, ...} over the field of real numbers, with a bilinear non-associative
"product" [a, b] satisfying (15.6).
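
For a finite-dimensional algebra given by a table of structure constants, the two identities (15.6) reduce to finitely many arithmetic checks on basis elements. The sketch below is a hedged illustration: the encoding `c[i][j][k]` for the coefficient of $e_k$ in $[e_i, e_j]$ and the function names are assumptions of the example, and the cross product on three-dimensional space is used as a familiar test case.

```python
from itertools import product

def is_lie_algebra(c):
    """Check (15.6) on basis elements, where [e_i, e_j] = sum_k c[i][j][k] e_k."""
    idx = range(len(c))
    antisymmetric = all(c[i][j][k] == -c[j][i][k]
                        for i, j, k in product(idx, repeat=3))

    def jacobi_component(i, j, k, l):
        # e_l-coefficient of [e_i,[e_j,e_k]] + [e_j,[e_k,e_i]] + [e_k,[e_i,e_j]]
        return sum(c[j][k][m] * c[i][m][l]
                   + c[k][i][m] * c[j][m][l]
                   + c[i][j][m] * c[k][m][l] for m in idx)

    jacobi = all(jacobi_component(i, j, k, l) == 0
                 for i, j, k, l in product(idx, repeat=4))
    return antisymmetric and jacobi

def bracket(c, a, b):
    """Bilinear extension of the table c to coordinate vectors a and b."""
    r = len(c)
    return [sum(a[i] * b[j] * c[i][j][k] for i in range(r) for j in range(r))
            for k in range(r)]

def levi_civita(i, j, k):
    """Sign of the permutation (i, j, k) of (0, 1, 2), or 0 on a repeat."""
    return (i - j) * (j - k) * (k - i) // 2

# Cross product on R^3: [e_i, e_j] = sum_k eps(i, j, k) e_k.
cross = [[[levi_civita(i, j, k) for k in range(3)] for j in range(3)]
         for i in range(3)]

# A symmetric "product" in two dimensions, which violates [a, b] = -[b, a].
symmetric = [[[0, 0], [1, 0]], [[1, 0], [0, 0]]]

print(is_lie_algebra(cross), is_lie_algebra(symmetric))
print(bracket(cross, [1, 0, 0], [0, 1, 0]))
```

Because the bracket is bilinear, verifying the identities on a basis is enough; the check does not decide whether a given table arises from a particular local group.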
Suppose $G_r$ has an analytic canonical coordinate system $a^1, \dots, a^r$. Then
it follows from (15.5, 15.7) that if we use the commutator [a, b] for product, $G_r$
is converted into an r-parameter Lie algebra. We denote this algebra by $L(G_r)$.

Let us now regard two Lie algebras as equivalent if they are isomorphic, and
two Lie groups if they are locally isomorphic. Then the totality of r-parameter
Lie groups falls into a system $\Gamma_r$ of equivalence classes and the algebras fall
into equivalence classes $\mathfrak{L}_r$.

To each element $\gamma_r$ of $\Gamma_r$ there is associated a unique element $\mathfrak{l}(\gamma_r)$ of $\mathfrak{L}_r$ in the
following way. By (12.4), $\gamma_r$ contains a $G_r$ with analytic canonical coordinates
$a^1, \dots, a^r$. We take for $\mathfrak{l}(\gamma_r)$ that equivalence class of Lie algebras which
contains $L(G_r)$. It is of course necessary to remark that if $G'_r$ is a local group with
analytic canonical coordinates $a'^1, \dots, a'^r$ and if $G'_r$ is locally isomorphic to $G_r$,
then $L(G'_r)$ is isomorphic to $L(G_r)$. This follows from (15.8).

(16.1) The correspondence $\gamma_r \to \mathfrak{l}(\gamma_r)$ is one-one between $\Gamma_r$ and $\mathfrak{L}_r$.

PROOF. We note first that the correspondence $\gamma_r \to \mathfrak{l}(\gamma_r)$ has a single-valued
inverse. For, isomorphism between $L(G_r)$ and $L(G'_r)$ implies local isomorphism
between $G_r$ and $G'_r$. This is a consequence of (12.8).

It remains to be shown that for every element $\mathfrak{l}_r$ of $\mathfrak{L}_r$ there exists an element
$\gamma_r$ of $\Gamma_r$ such that $\mathfrak{l}(\gamma_r) = \mathfrak{l}_r$. For this it is sufficient to show that if $L_r$ is a
given r-parameter Lie algebra, there exists a $G_r$ such that $L(G_r) = L_r$.

We make a preliminary remark. A second method of associating a Lie algebra 
with a given G, with analytic canonical coordinates a’, --- , a’ is the following. 
Let (G, , S,) be an effective realization of G, such that relative to some coordinate 
system x’, --- , 2" in S, the functions (a-x)' have continuous second derivatives 
with respect to a',---, a’. Such a realization always exists,—the regular 
realization, for example. The vector fields 


are of class 1 and 
(16.2) a-xz=a 


(The proof of (16.2) is contained in (12.3) with $\psi_i$ replaced by $\xi_i$ and $a^t b$ by
$(ta) \cdot x$.) By (15.3), $\xi_1, \dots, \xi_r$ are linearly independent and $L(\xi_1, \dots, \xi_r)$ is a
closed system. Hence $L(\xi_1, \dots, \xi_r)$ may be regarded as an r-dimensional
r-parameter Lie algebra $L_\xi(G_r)$ associated with $G_r$. By (15.3) $L_\xi(G_r)$ is
isomorphic to $L(G_r)$.

From these remarks it follows by (14.12) that in order to complete the proof 
of (16.1), it is sufficient to show that in a space S, with coordinate system 2’, 

, x” there exist vector fields & , --- , & linearly independent and of class 1 

over some open set # and such that, as an r-parameter Lie algebra, L(&, --- , &,) 
is isomorphic to the given Lie algebra L, , the ¢’s being such that L is closed. 

Let a;,---, a, bea basis for L, , and let {a;, ax] = >> It will be sufficient 
to find é’s such that , &] = , that is such that 


* Oxi 

To show that such ¢’s exist, it is sufficient to show that there exist vector fields 
¢:, + ,@, in S, which aré analytic in the neighborhood of x = 0, are such that 
the matrix (¢} (O)) is non-singular, and which satisfy the equations 

965 

For, if we take for (Ei (x)) the inverse of the matrix ($}(x)) we obtain vector fields 
£; which satisfy (16.3) (the verification of this is straightforward). Since the 
matrix (¢}(O)) is non-singular, the ¢’s will be analytic and linearly independent 
in some neighborhood of x = 0 




We shall show that the functions ¢} given by (14.16) (with a replaced by 2) 
have the required properties. Evidently these functions are analytic and 
the matrix (¢}(O)) is the identity. To verify (16.4), suppose p < q. Then 
d9',/dx* = 0. Hence, denoting the left side of (16.4) by ® and writing y; = x C, 
wehave » 


Now on account of (13.1, 13.2) the numbers c}, satisfy (14.5). Hence we may 
use (14.6) writing it in the form 


= (Cro). 


We use this formula in (16.5) to shift C, to the left, step by step, giving m sue- 
cessively the values p — 1, ---,0. We obtain 


The proof in the case p > q is the same. When p = q both sides of (16.4) 
vanish identically (the right side because c}, = — cj;). 


17. Local subgroups of Lie groups


Let us regard a local group consisting of a single element as a "$G_0$ with analytic
canonical coordinates". Then it follows directly from (5.7) and the definition
of Lie group that every local subgroup of a Lie group is a Lie group.
Let H be a local subgroup of a $G_r$ with analytic canonical coordinates. By
(15.7) H may be regarded as imbedded in a flat subspace H® of G,. Regarded 
as a subset of the Lie algebra L(G,), H® is evidently a subalgebra (this follows 
from (15.1)) and this subalgebra is precisely L(H). 

(17.1) If H is a flat subspace of $G_r$ with analytic canonical coordinates $a^1, \dots, a^r$
and if H, regarded as a subset of $L(G_r)$, is a subalgebra L' of $L(G_r)$, then H is a
local subgroup of $G_r$.

PROOF. Since canonical coordinates will remain canonical after undergoing
a linear homogeneous transformation, we may suppose that H is the
$(a^1, \dots, a^p)$-coordinate subspace of $G_r$. Now let $\psi_i$ be the vector fields defined by (3.1).
We have (12.9)


(17.2) ab = (a = --- , 0")). 


By (15.3), $L(\psi_1, \dots, \psi_r)$ is a Lie algebra isomorphic to $L(G_r)$. The isomorphism
is given by $a \to \sum a^i \psi_i$. Hence the subsystem of $L(\psi_1, \dots, \psi_r)$ which
corresponds to H is precisely $L(\psi_1, \dots, \psi_p)$. Since this subsystem is closed, it
follows from (14.12) that the function


1 
a-b=e"™... where a= ,0) 




defines a realization of a p-parameter group $G_p$ (which, as a space, is identical
with H). Let $a_i$ denote the projection on the $a^i$-axis of an arbitrary element
a of $G_r$: $a_i = (0, \dots, a^i, \dots, 0)$. It follows from (17.2) that when $a \in G_p = H$,


(17.3) $a \cdot b = a_1 \cdots a_p b$, hence $a \cdot O = a_1 \cdots a_p$.


Now let a, c be arbitrary elements of H near O, and let d = ca where ca is the
group product defined in the group $G_p$. From (17.3) we have


$d_1 \cdots d_p = d \cdot O = c \cdot (a \cdot O) = c_1 \cdots c_p\, a_1 \cdots a_p$.


We draw the following conclusion: consider elements of $G_r$ of the form $a_1 \cdots a_p$
where $a_i$ is an arbitrary element on the $a^i$-axis and $|a_i| < \delta$, $\delta$ small. The
totality of these elements is a local subgroup of $G_r$. By (5.7) this local subgroup
can be regarded as part of a linear subspace H' of $G_r$. Evidently H' contains
the $a^1$-, ..., $a^p$-axes. Hence H' contains H. It is easy to see that the
dimension of H' can not exceed that of H. Hence H' = H.

(17.4) Suppose H is a local subgroup of $G_r$ so that L(H) is a subalgebra of
$L(G_r)$. If H is normal (i.e. if, near O, $b \in H$ implies $aba^{-1} \in H$) then so is L(H)
(i.e. $b \in H$ implies $[a, b] \in H$) and conversely. The first statement follows from
(15.1), the converse from (7.2).


APPENDIX 


Let R be a spherical region in a euclidean space $E_n$, B its boundary. Let $\sigma$
be a continuous mapping of $\bar{R} = R + B$ in $E_n$ such that for each point b in B,
the vector $(b, \sigma b)$, if it is not of length 0, makes an acute angle with the outward
normal to B at b. Then $R \subset \sigma\bar{R}$.

The proof can be based on the concept of the degree of a mapping (see for
example: Lefschetz, Algebraic Topology). Suppose that, contrary to the
conclusion, there exists a point x in R not covered by $\sigma\bar{R}$. Using the boundary
conditions for $\sigma$, it is easy to see that, without disturbing the relation $x \notin \sigma\bar{R}$,
$\sigma$ can be made to pass continuously into a mapping $\sigma'$ which transforms $\bar{R}$ into
itself and in particular leaves every point on B invariant. Since x is not covered
by $\sigma'\bar{R}$, the degree of $\sigma'$, i.e. the algebraic number of times that $\sigma'\bar{R}$ covers $\bar{R}$,
is zero. But by standard topological methods it can be proved that the degree
of $\sigma'$ equals the degree of the mapping B → B induced by $\sigma'$, namely +1. This
contradiction establishes the theorem.


COLUMBIA UNIVERSITY.


REFERENCES 


1. GARRETT BIRKHOFF, Analytical groups, Trans. Am. Math. Soc. 43 (1938) 61-101.

2. E. CARTAN, La théorie des groupes finis et continus et l'analyse situs, Mem. des Sciences
(1930) Fasc. XLII.

3. E. R. VAN KAMPEN, Remarks on systems of ordinary differential equations, Am. J. 59
(1937) 144-152.

4. B. L. VAN DER WAERDEN, Göttingen lectures on continuous groups (mimeographed).

5. J. H. C. WHITEHEAD, A note on Maurer's equations, J. Lond. Math. Soc. 7 (1932) 223-227.




ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


COMBINATIONS OF CLOSURE RELATIONS 


By OYSTEIN ORE
(Received October 27, 1942) 


The present paper contains a study of the system of all possible closure
relations definable over a given set. It has already been shown by Garrett
Birkhoff [1] that this system may be considered to be a structure so that the
interrelations of the various closure relations indicate properties of this structure.
One finds that the structure is a complete dual point structure satisfying the
so-called Birkhoff condition. It is found furthermore that to every closure
relation there exists a unique maximal closure relation relatively prime to it.
From this result one obtains a condition for two closure relations to be
complementary. The implications of the Dedekind law are studied briefly. The
application of one closure relation after another is called the product of the two
and a necessary and sufficient condition for the product of the two closure
relations to be a closure relation is given.

It is shown that the automorphisms of the structure of closure relations are 
all induced by the one-to-one transformations of the basic set. By means of a 
general theorem on the homomorphisms of point structures all complete homo- 
morphisms are determined. To conclude there are a few remarks about closure 
relations in which every point is closed or also determined by its closure.

In another paper some similar investigations will be carried through for 
additive closure relations. 


1. Basic properties of closure operations 


In the following S shall denote a fixed set. A correspondence $\Gamma$ which
associates with every subset A of S some unique subset $\bar{A}$ of S


$A \to \bar{A} = \Gamma(A)$


shall be called a closure relation or closure operation when it has the three
properties:
Closure:

Idempotent: $\bar{\bar{A}} = \bar{A}$

Increasing: $\bar{A} \supseteq A$

Order preserving: $A_1 \supseteq A_2$ implies $\bar{A}_1 \supseteq \bar{A}_2$.
The set $\bar{A}$ is called the closure of A under $\Gamma$ and any such image set $\bar{A}$ is closed
under $\Gamma$. The closure of the whole set is S and by an additional definition one
prescribes that the void set 0 is closed.


Let us mention a few simple properties which follow directly from the
definition of a closure relation:

A set is closed if and only if $\bar{A} = A$.




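The closure $\bar{A}$ of a set A is the smallest closed set containing A.

For the closures of sums and intersections of the sets in some family $\mathfrak{F}(A_i)$
of subsets of S one finds
(1) $\overline{\sum A_i} = \overline{\sum \bar{A}_i} \supseteq \sum \bar{A}_i$
and
(2) $\prod \bar{A}_i = \overline{\prod \bar{A}_i} \supseteq \overline{\prod A_i}$.

From (2) one concludes further: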

The intersection of any family of closed sets is closed. 
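
On a finite set the three defining properties, and the consequence just stated, can be checked exhaustively. The following sketch is an illustration under assumptions of its own: S is taken to be {0, 1, 2, 3} and the closure of a set is its interval hull (all integers between its least and greatest element); neither choice comes from the paper.

```python
from itertools import combinations

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def interval_hull(A):
    """Example closure: fill in every integer between min(A) and max(A)."""
    if not A:
        return frozenset()          # the void set is closed
    return frozenset(range(min(A), max(A) + 1))

def is_closure_operation(cl, S):
    """Check idempotent, increasing and order preserving on all subsets of S."""
    subs = subsets(S)
    for A in subs:
        if cl(cl(A)) != cl(A):      # idempotent
            return False
        if not A <= cl(A):          # increasing
            return False
    for A in subs:
        for B in subs:
            if A <= B and not cl(A) <= cl(B):   # order preserving
                return False
    return True

def closed_sets(cl, S):
    return {A for A in subsets(S) if cl(A) == A}

S = frozenset(range(4))
closed = closed_sets(interval_hull, S)

print(is_closure_operation(interval_hull, S))
# The intersection of closed sets is again closed.
print(all((A & B) in closed for A in closed for B in closed))
# Taking complements is not a closure operation.
print(is_closure_operation(lambda A: S - A, S))
```

The brute-force loop is quadratic in the number of subsets, which is adequate for a four-element set and nothing more.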

An intersection ring of sets in S is a family of subsets which includes the inter- 
section of any two of its sets and a complete intersection ring contains the inter- 
section of any subfamily of its sets. An intersection ring of sets over S is defined 
to contain the sets 0 and S. This terminology permits us to express the
following basic result (see E. H. Moore [1]):

THEOREM 1. In a closure relation the closed sets form a complete intersection
ring of sets over S. Conversely any closure relation is obtained from a complete
intersection ring $\mathfrak{R}$ over S by associating with each set A the smallest set $\bar{A}$ in
$\mathfrak{R}$ containing A.
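
The converse half of Theorem 1 is constructive, and for a finite set it can be sketched directly. In the example below the function names are hypothetical: a family of subsets is first completed to an intersection ring over S (adding 0, S and all intersections), and the closure of A is then read off as the intersection of all ring members containing A.

```python
def complete_intersection_ring(family, S):
    """Smallest family containing `family`, the void set and S that is
    closed under intersection (pairwise suffices for a finite family)."""
    ring = {frozenset(S), frozenset()} | {frozenset(A) for A in family}
    changed = True
    while changed:
        changed = False
        for A in list(ring):
            for B in list(ring):
                C = A & B
                if C not in ring:
                    ring.add(C)
                    changed = True
    return ring

def closure_from_ring(ring, A):
    """Smallest member of the ring containing A."""
    A = frozenset(A)
    result = None
    for R in ring:
        if A <= R:
            result = R if result is None else result & R
    return result   # never None, because S itself belongs to the ring

S = {0, 1, 2}
ring = complete_intersection_ring([{0, 1}, {1, 2}], S)

print(sorted(sorted(R) for R in ring))
print(sorted(closure_from_ring(ring, {0})))
print(sorted(closure_from_ring(ring, {0, 2})))
```

Since the ring is closed under intersection, the intersection computed in `closure_from_ring` is itself a member of the ring, which is why it is the smallest closed set containing A.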

A set S in which a closure relation has been defined is also called a topological 
space and instead of a closure relation we may use the term topology. Any family
of sets in S defines a topology through the complete intersection ring of sets 
over S which it generates. For a given closure relation any family of sets from 
which all closed sets can be obtained by forming set intersections is called a 
basis for the closed sets. As usual the complement of a closed set is an open set. 
The open sets form a complete sum ring of sets over S, and any family of open sets 
from which all other open sets can be obtained by summation is called a basis 
for the open sets. 


2. The structure of closure relations 


The system of all closure relations in a set becomes a partially ordered set
by the following definition of an inclusion: Let $\Gamma_1$ and $\Gamma_2$ be two closure
relations and $\Gamma_1(A)$ and $\Gamma_2(A)$ the closures of a set A in the two topologies. We
shall say that $\Gamma_1$ contains $\Gamma_2$ and write $\Gamma_1 \supseteq \Gamma_2$ when one has


$\Gamma_1(A) \supseteq \Gamma_2(A)$


for every subset A of S. From this definition one obtains:

THEOREM 2. The necessary and sufficient condition for a closure relation $\Gamma_1$
to contain another closure relation $\Gamma_2$ is that any set closed under $\Gamma_1$ is
closed under $\Gamma_2$.

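PROOF: When $\Gamma_1$ contains $\Gamma_2$ one has for any subset A of S

$\Gamma_1(A) = \Gamma_1(\Gamma_1(A)) \supseteq \Gamma_2(\Gamma_1(A)) \supseteq \Gamma_1(A)$

so that

$\Gamma_2(\Gamma_1(A)) = \Gamma_1(A)$

and $\Gamma_1(A)$ is closed under $\Gamma_2$. This may also be stated that the complete
intersection ring $\mathfrak{R}(\Gamma_1)$ of closed sets under $\Gamma_1$ is contained in the ring $\mathfrak{R}(\Gamma_2)$ of closed
sets under $\Gamma_2$. Conversely since for any closure relation $\Gamma$ the closure $\Gamma(A)$
is the least closed set containing A one must have

$\Gamma_1(A) \supseteq \Gamma_2(A)$

whenever

$\mathfrak{R}(\Gamma_2) \supseteq \mathfrak{R}(\Gamma_1)$.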
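
Theorem 2 lends itself to an exhaustive check on a small set. The sketch below compares the two sides of the equivalence for every ordered pair drawn from four example closure relations on S = {0, 1, 2}; the particular operations (universal, identical, and the "down" and "up" hulls) are illustrative choices, not taken from the text.

```python
from itertools import combinations

S = frozenset(range(3))

def subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def universal(A):    # every non-void set has closure S
    return S if A else frozenset()

def identical(A):    # every set is its own closure
    return frozenset(A)

def down(A):         # closure {0, ..., max A}
    return frozenset(range(max(A) + 1)) if A else frozenset()

def up(A):           # closure {min A, ..., max S}
    return frozenset(range(min(A), max(S) + 1)) if A else frozenset()

def contains(g1, g2):
    """g1 contains g2 when g1(A) includes g2(A) for every subset A."""
    return all(g1(A) >= g2(A) for A in subsets(S))

def closed_sets(g):
    return {A for A in subsets(S) if g(A) == A}

def theorem2_holds(g1, g2):
    """Both sides of Theorem 2 agree for the pair (g1, g2)."""
    return contains(g1, g2) == (closed_sets(g1) <= closed_sets(g2))

operations = [universal, identical, down, up]
print(all(theorem2_holds(g1, g2) for g1 in operations for g2 in operations))
print(contains(universal, down), contains(down, up))
```

The pair `down`, `up` is included because neither contains the other, so the equivalence is exercised in the negative case as well.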


We prove next a result due to Garrett Birkhoff [1]: 

THEOREM 3. The closure relations over a set form a complete structure. 

PROOF: One sees that in the partially ordered set of closure relations the union
$\bigvee \Gamma_i$ of a set of closure relations $\Gamma_i$ is the closure relation
defined by the sets common to all the intersection rings defined by the various
$\Gamma_i$. The intersection $\bigwedge \Gamma_i$ of the same closure relations is
the closure relation defined by the complete intersection ring generated by all
sets closed under any $\Gamma_i$.

The system of all complete intersection rings of sets over $S$ forms a structure
in which the cross cut $\mathfrak{R}_1 \cap \mathfrak{R}_2$ of two rings
$\mathfrak{R}_1$ and $\mathfrak{R}_2$ consists of their common sets and the union
$\mathfrak{R}_1 \cup \mathfrak{R}_2$ is the ring generated by the sets in
$\mathfrak{R}_1$ and $\mathfrak{R}_2$. From the proof of theorem 3 one concludes:

THEOREM 4. The structure of all complete intersection rings over a set S is 
dually isomorphic to the set of all topologies in S. 

In connection with the definition of the cross-cut of several closure relations 
one should mention the following result: 

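THEOREM 5. Any intersection

$$D = \prod_i \Gamma_i(A_i)$$

of closed sets $\Gamma_i(A_i)$ from a system $\{\Gamma_i\}$ of closure relations
can be represented in the form

$$D = \prod_i \Gamma_i(D).$$

PROOF: Since

$$\Gamma_i(A_i) \supseteq D$$

one obtains

$$\Gamma_i(\Gamma_i(A_i)) = \Gamma_i(A_i) \supseteq \Gamma_i(D)$$

consequently

$$\prod_i \Gamma_i(A_i) = D \supseteq \prod_i \Gamma_i(D) \supseteq D$$

so that the theorem follows. Theorem 5 can also be stated in the form: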



THEOREM 6. The intersection $\bigwedge \Gamma_i$ of a system of closure relations
is the closure relation in which the closure of a set is the intersection of its
closures in the various relations $\Gamma_i$.
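
As a check of Theorem 6 on a small example, the sketch below forms the intersection of two closure relations as the ring generated by both families of closed sets and compares closures set by set. It is illustrative only; the representation by families of closed sets and the helper names are assumptions of this example.

```python
from itertools import combinations

def intersection_ring(family):
    """Close a family of frozensets under pairwise intersection."""
    ring = set(family)
    changed = True
    while changed:
        changed = False
        for X, Y in combinations(list(ring), 2):
            if X & Y not in ring:
                ring.add(X & Y)
                changed = True
    return ring

def closure(ring, A):
    """Smallest closed set of the ring containing A."""
    result = frozenset().union(*ring)
    for C in ring:
        if A <= C:
            result &= C
    return result

def subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

fs = frozenset
S = fs({1, 2, 3})
gamma1 = {fs(), fs({1, 2}), S}
gamma2 = {fs(), fs({2, 3}), S}

# Closed sets of the intersection: the ring generated by both families.
meet = intersection_ring(gamma1 | gamma2)

# Theorem 6: the closure in the intersection is the intersection of the closures.
agrees = all(closure(meet, A) == closure(gamma1, A) & closure(gamma2, A)
             for A in subsets(S))
print(agrees)
```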

The structure of closure relations has a universal element $V$ containing all
others. This universal closure relation is defined by the property that

$$V(A) = S$$

for every set $A \neq O$. Similarly the structure has a zero element contained in
all others. This is the identical closure relation $I$ for which

$$I(A) = A$$

for every subset $A$ of $S$.

The structure of closure relations contains maximal elements $M$ contained in
no other closure relations except $V$. Such a maximal closure relation must have
an intersection ring of closed sets consisting of the trivial sets $O$ and $S$ and a
single other closed set $A$. We shall call $A$ the principal set of $M$ and write
$M = M_A$. According to the definition of inclusion for closure relations one sees
that a closure relation $\Gamma$ is contained in the maximal closure relations $M_A$
where $A$ is a set closed under $\Gamma$. Obviously one has:

THEOREM 7. Any closure relation is the cross-cut of the maximal closure rela- 
tions in which it is contained. 

The structure of closure relations also has minimal elements containing I 
and no other closure relation. Among the closed sets in a minimal closure rela- 
tion N one cannot have all maximal sets S — a, where a is some element, because 
every subset is the intersection of such maximal sets and one would have N = I. 
But on the other hand if a closure is to be minimal it cannot omit from its closed 
sets more than one such maximal set. Thus we see that the minimal closure 
relations are those in which every set is closed except a single maximal set S — a. 
This can also be stated in the form: 

THEOREM 8. The minimal closure relations $N_a$ are those in which every set
except the point $a$ is open.

It should be noted that the dual of theorem 7 does not hold: Not every closure 
relation is the union of the minimal closure relations which it contains. To see 
this we observe that every minimal closure relation has among its closed sets 
all those which omit two or more elements of S and it is not difficult to verify 
that only closure relations with this property can be written as the union of the 
minimal closure relations they contain. 

A structure is called a point structure if every one of its elements is the union 
of the points or minimal elements it contains. If every element is the inter- 
section of the maximal elements in which it is contained the structure may be 
called a dual point structure. Thus we have: 

THEOREM 9. The structure of topologies is a complete dual point structure.

According to theorem 4 the structure of complete intersection rings is a com- 
plete point structure. 




A closure relation $\Gamma$ may contain certain maximal sets $S - d$ among its
closed sets, hence also certain points $d$ which are open sets, i.e. isolated
points. We shall denote by

$$O_\Gamma = \{d\}$$

the set of all isolated points for the closure relation $\Gamma$. This concept
enters into the following considerations.

Two closure relations $\Gamma_1$ and $\Gamma_2$ are said to be relatively prime
when $\Gamma_1 \cap \Gamma_2 = I$. This means that the intersections of the closed
sets in the two closure relations constitute all subsets of $S$. Dually when
$\Gamma_1 \cup \Gamma_2 = V$ the corresponding intersection rings of closed sets
have no sets in common except $O$ and $S$. We shall now characterize the closure
relations which are relatively prime to a given relation $\Gamma$. Let
$\Gamma \cap \Gamma^* = I$ so that every set is the intersection of two sets closed
under $\Gamma$ and $\Gamma^*$ respectively. Since a maximal set $S - a$ cannot be
obtained as the intersection of two larger sets, one concludes that any such set
must be closed either under $\Gamma$ or under $\Gamma^*$. But conversely when this
is the case any set is the intersection of sets closed under $\Gamma$ and
$\Gamma^*$ so that we conclude:

THEOREM 10. Two closure relations $\Gamma$ and $\Gamma^*$ are relatively prime if
and only if every point is an isolated point in at least one of these topologies,
hence if

$$O_\Gamma + O_{\Gamma^*} = S.$$
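
Theorem 10 lends itself to a direct check on a finite set. The following sketch is an illustration under the assumption that a closure relation is given by its family of closed sets; the names `isolated_points` and `relatively_prime` are hypothetical.

```python
from itertools import combinations

def intersection_ring(family):
    """Close a family of frozensets under pairwise intersection."""
    ring = set(family)
    changed = True
    while changed:
        changed = False
        for X, Y in combinations(list(ring), 2):
            if X & Y not in ring:
                ring.add(X & Y)
                changed = True
    return ring

def isolated_points(ring, S):
    """Points d whose complement S - d is closed, so that d alone is an open set."""
    return {d for d in S if S - {d} in ring}

def relatively_prime(ring1, ring2, S):
    """The intersection is I: both rings together generate every subset of S."""
    return len(intersection_ring(ring1 | ring2)) == 2 ** len(S)

fs = frozenset
S = fs({1, 2, 3})
gamma = intersection_ring({fs(), fs({2, 3}), fs({1, 3}), S})   # isolated points 1 and 2
gamma_star = {fs(), fs({1, 2}), S}                              # isolated point 3
other = {fs(), fs({1}), S}                                      # no isolated points

for ring in (gamma_star, other):
    covered = (isolated_points(gamma, S) | isolated_points(ring, S)) == S
    print(relatively_prime(gamma, ring, S), covered)            # the two answers agree
```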


From this analysis also follows that among the various closures relatively
prime to a given closure $\Gamma$ there exists a unique maximal one. This is the
closure $\Gamma'$ whose closed sets are generated by those maximal sets $S - c$
which are not closed under $\Gamma$. Thus we can state:

THEOREM 11. To any closure relation $\Gamma$ there exists a unique maximal closure
relation $\Gamma'$ relatively prime to $\Gamma$. This closure relation is defined by
the property that the points in

$$O_{\Gamma'} = S - O_\Gamma$$

form a basis for its open sets.

From these results it is easy to derive the condition for a closure relation to
have a complement in the structure of all closure relations. If $\Gamma^*$ is to be
a complement of the closure relation $\Gamma$ one must have

$$\Gamma \cup \Gamma^* = V, \qquad \Gamma \cap \Gamma^* = I.$$

The last of these conditions states that $\Gamma^*$ shall be relatively prime to
$\Gamma$, hence one sees that if a complement exists the maximal relatively prime
relation $\Gamma'$ defined by theorem 11 must be such a complement, in fact maximal
among all complements. Since $\Gamma$ and $\Gamma'$ shall have intersection rings of
closed sets without common sets except $O$ and $S$ it follows that $\Gamma$ can have
no closed sets of the form $S - C'$ where $C'$ is a subset of the set $O_{\Gamma'}$
of isolated points in $\Gamma'$. This is equivalent to saying that no open set under
$\Gamma$ can be a subset of $O_{\Gamma'}$ so that we have:




THEOREM 12. There exists a complement to a closure relation $\Gamma$ if and only
if no set open under $\Gamma$ is a subset of the set $S - O_\Gamma$ of all
non-isolated points under $\Gamma$. If a complement exists there exists a unique
maximal one containing all others, namely the maximal closure relation $\Gamma'$
relatively prime to $\Gamma$.

Using the ordinary structure terminology we shall say that a closure relation
$\Gamma_1$ is prime over another $\Gamma_2$ or $\Gamma_2$ is prime under $\Gamma_1$
when $\Gamma_1 \supset \Gamma_2$ and there exists no closure relation between
$\Gamma_1$ and $\Gamma_2$. This implies of course that there can be no complete
intersection ring of sets between two such rings $\mathfrak{R}_1$ and
$\mathfrak{R}_2$ consisting of the closed sets in $\Gamma_1$ and $\Gamma_2$
respectively. Consequently it must be possible to obtain $\mathfrak{R}_2$ from
$\mathfrak{R}_1$ through the adjunction of a single set $A_2$ to $\mathfrak{R}_1$.
Let us form the intersection $A_3 = A_1 \cdot A_2$ of $A_2$ with some set $A_1$ in
$\mathfrak{R}_1$. This set $A_3$ must belong to $\mathfrak{R}_1$ because otherwise
one could have adjoined $A_3$ instead of $A_2$ to $\mathfrak{R}_1$ and one would
have obtained a complete intersection ring between $\mathfrak{R}_1$ and
$\mathfrak{R}_2$ against assumption. Thus a closure relation $\Gamma_2$ is prime
under a closure relation $\Gamma_1$ if the ring $\mathfrak{R}_2$ corresponding to
$\Gamma_2$ contains some set $A_2$ which is not the intersection of other sets in
$\mathfrak{R}_2$ and $\mathfrak{R}_1$ is obtained by omitting $A_2$ from
$\mathfrak{R}_2$.

An important property of the structure of closure relations is expressed in the 
theorem: 

THEOREM 13. Let $\Gamma_1 \supset \Gamma_2$ be two closure relations, one prime
over the other, and let $\Gamma$ be an arbitrary closure relation. Then the closure
relation $\Gamma_1 \cup \Gamma$ is equal to or prime over $\Gamma_2 \cup \Gamma$.

PROOF: Let us denote by $\mathfrak{R}_1$, $\mathfrak{R}_2$, and $\mathfrak{R}$
respectively the intersection rings of closed sets corresponding to $\Gamma_1$,
$\Gamma_2$ and $\Gamma$. According to the preceding remarks $\mathfrak{R}_2$
contains a single set $A_2$ not in $\mathfrak{R}_1$. The ring corresponding to
$\Gamma_1 \cup \Gamma$ consists of the common sets in $\mathfrak{R}_1$ and
$\mathfrak{R}$ and the ring corresponding to $\Gamma_2 \cup \Gamma$ of the sets
common to $\mathfrak{R}_2$ and $\mathfrak{R}$. If these two rings are not equal they
can at most differ by the set $A_2$ and theorem 13 follows.

An immediate consequence of this theorem is: 

If the two closure relations $\Gamma_1$ and $\Gamma_2$ are both prime over
$\Gamma_1 \cap \Gamma_2$ then $\Gamma_1 \cup \Gamma_2$ is prime over both $\Gamma_1$
and $\Gamma_2$.

From this so-called Birkhoff property follows as in a general structure that the 
following chain theorem holds: 

If two closure relations $\Gamma \supset \Gamma'$ are joined by a finite chain of
closure relations, each prime over the next, then any two such chains between
$\Gamma$ and $\Gamma'$ have the same length.

For a finite set $S$ the number of closure relations in a complete chain from $I$
to $V$ is found to be $2^n$ where $n$ is the number of elements in $S$.


3. Analysis of the Dedekind law 
We shall turn next to the study of the so-called Dedekind law

$$A \cap (B \cup \Gamma) = B \cup (A \cap \Gamma), \qquad A \supseteq B \tag{3}$$

in the structure of all closure relations in a set. It is not difficult to verify by




examples that this relation does not hold in general for closure relations. Let
us establish what the condition (3) implies for the closed sets under the three
closure relations in question. The sets which are closed under $A \cap \Gamma$ are
the intersections $A_1 \cdot C_1$ where $A_1$ is closed under $A$ and $C_1$ under
$\Gamma$. Therefore the sets closed under $B \cup (A \cap \Gamma)$ are those which
are simultaneously closed under $B$ and $A \cap \Gamma$, i.e. the sets $B_1$ closed
under $B$ which are representable in the form

$$B_1 = A_1 \cdot C_1. \tag{4}$$

The sets closed under $B \cup \Gamma$ are those which are closed simultaneously
under $B$ and $\Gamma$

$$D_2 = B_2 = C_2$$

hence the sets closed under $A \cap (B \cup \Gamma)$ are the intersections
$A_2 \cdot D_2$ where $A_2$ is closed under $A$. If the relation (3) is to hold
there must exist for every set of the form (4) some representation

$$B_1 = A_1 \cdot C_1 = A_2 \cdot D_2.$$

Since one can obviously assume that

$$A_2 = A(B_1), \qquad D_2 = (B \cup \Gamma)(B_1)$$

we can say:

THEOREM 14. The necessary and sufficient condition for the Dedekind relation
(3) to hold for the three closure relations $A$, $B$ and $\Gamma$ is that any set
$B_1$ closed under $B$ which is the intersection

$$B_1 = A_1 \cdot C_1$$

of a set $A_1$ closed under $A$ and a set $C_1$ closed under $\Gamma$ shall also be
representable in the form

$$B_1 = A_1 \cdot C_1 = A(B_1) \cdot (B \cup \Gamma)(B_1).$$


We shall use this theorem to obtain conditions for the Dedekind relation to
hold identically in one of the three closure relations when the other two are fixed.
Let us assume first that $B$ and $\Gamma$ are given closure relations and we wish to
determine which properties they must have in order that (3) shall be fulfilled for
all closure relations $A$ containing $B$. In order to establish a necessary
condition we shall take $A = M_{B_1}$ as a maximal closure relation. Here $A$
contains $B$ if and only if the principal set $B_1$ in $A$ is a set closed under
$B$. From theorem 14 one concludes that if an intersection $B_1 \cdot C_1$ is closed
under $B$ one must have

$$B_2 = B_1 \cdot C_1 = A(B_2) \cdot (B \cup \Gamma)(B_2).$$

There are only three possibilities, namely $O$, $S$ and $B_1$ for the set $A(B_2)$
and the first of these is excluded when we make the trivial assumption that $B_2$ is
not void. When $A(B_2) = S$ then $B_2$ becomes closed under $\Gamma$. Finally when
$A(B_2) = B_1$ one must have

$$B_2 = B_1 \cdot C_1 = B_1 \cdot (B \cup \Gamma)(B_2) \tag{5}$$




and this condition is seen to be satisfied in all three cases. But conversely when
this condition (5) is fulfilled for all sets $B_1$ the requirements of theorem 14
are also satisfied so that we can state:

THEOREM 15. The necessary and sufficient condition for the Dedekind law (3)
to hold for all closure relations $A$ containing $B$ is that whenever the
intersection $B_1 \cdot C_1$ of a set $B_1$ closed under $B$ and $C_1$ closed under
$\Gamma$ is itself closed under $B$ there shall also exist a representation

$$B_1 \cdot C_1 = B_1 \cdot (B \cup \Gamma)(B_1 \cdot C_1).$$

It may be observed further that if the relation (3) holds for all $A$ one also has

$$A = B \cup (A \cap \Gamma) \tag{6}$$

for every $A$ such that

$$B \cup \Gamma \supset A \supset B. \tag{7}$$

Conversely when (6) holds for all $A$ satisfying (7) the relation (3) is fulfilled
for all $A$.

Let us now assume in (3) that $A$ and $\Gamma$ are fixed while $B$ represents an
arbitrary closure relation contained in $A$. A particular closure relation $B$
contained in $A$ shall be constructed in the following way. We denote by
$B_1 = A_1 \cdot C_1$ the intersection of a set $A_1$ closed under $A$ and a set
$C_1$ closed under $\Gamma$ and adjoin $B_1$ to the sets closed under $A$ to obtain
the intersection ring of sets closed under $B$. Thus the closed sets under $B$ are
either closed under $A$ or they have the form

$$B_2 = A_2 \cdot B_1 = A_2 \cdot C_1$$

where $A_2$ is an arbitrary closed set under $A$ which is contained in $A_1$. Since
we can assume $A \neq B$ the set $B_1$ is not closed under $A$. Furthermore the set
$B_1$ is obviously the largest of all sets closed under $B$ but not under $A$. From
theorem 14 one concludes that if (3) shall hold one must have

$$B_1 = A_1 \cdot C_1 = A(B_1) \cdot (B \cup \Gamma)(B_1).$$

The set $(B \cup \Gamma)(B_1)$ cannot be closed under $A$ because then $B_1$ would
become closed under $A$ against our assumption. From

$$(B \cup \Gamma)(B_1) \supseteq B_1$$

one concludes therefore that

$$(B \cup \Gamma)(B_1) = B_1$$

so that the intersection $A_1 \cdot C_1$ becomes closed under $\Gamma$. Thus we have
the necessary condition that every intersection $A_1 \cdot C_1$ which is not closed
under $A$ shall be closed under $\Gamma$ and this condition is seen to be sufficient
according to theorem 14 so that we can state:


THEOREM 16. The necessary and sufficient condition for the Dedekind law (3)
to hold for a fixed pair of closure relations $A$ and $\Gamma$ and for any closure
relation $B$ contained in $A$ is that any intersection $A_1 \cdot C_1$ of a closed
set $A_1$ under $A$ and $C_1$ under $\Gamma$ be closed either under $A$ or $\Gamma$.
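
The failure of the Dedekind law in general, and the criterion of Theorem 16, can be seen on a three-element set. The sketch below is not taken from the original text; it assumes that a closure relation is represented by its family of closed sets, so that the union of two relations has the common closed sets and the intersection has the ring generated by both families. The names `meet`, `join` and `theorem16_condition` are hypothetical.

```python
from itertools import combinations

def intersection_ring(family):
    """Close a family of frozensets under pairwise intersection."""
    ring = set(family)
    changed = True
    while changed:
        changed = False
        for X, Y in combinations(list(ring), 2):
            if X & Y not in ring:
                ring.add(X & Y)
                changed = True
    return ring

def meet(r1, r2):
    """Intersection of two closure relations: ring generated by both families."""
    return frozenset(intersection_ring(r1 | r2))

def join(r1, r2):
    """Union of two closure relations: the common closed sets."""
    return frozenset(r1 & r2)

def theorem16_condition(ring_a, ring_g):
    """Every intersection of a set closed under A with one closed under Gamma
    is closed under A or under Gamma."""
    return all((X & Y) in ring_a or (X & Y) in ring_g
               for X in ring_a for Y in ring_g)

fs = frozenset
S = fs({1, 2, 3})
A = frozenset({fs(), fs({1, 2}), S})
B = frozenset({fs(), fs({2}), fs({1, 2}), S})   # contained in A: it has more closed sets
G = frozenset({fs(), fs({2, 3}), S})            # {1,2}.{2,3} = {2} is closed under neither
G2 = frozenset({fs(), fs({1}), S})              # here the condition is satisfied

for gamma in (G, G2):
    lhs = meet(A, join(B, gamma))               # A meet (B join Gamma)
    rhs = join(B, meet(A, gamma))               # B join (A meet Gamma)
    print(theorem16_condition(A, gamma), lhs == rhs)
```

For the pair $A$, `G` the left-hand side of (3) reduces to $A$ and the right-hand side to $B$, so the law fails; for `G2` both sides reduce to $A$.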

Since this condition is symmetric in $A$ and $\Gamma$ we have also obtained the
further result:

THEOREM 17. When

$$A \cap (B \cup \Gamma) = B \cup (A \cap \Gamma)$$

for every closure relation $B$ contained in $A$ one also has

$$\Gamma \cap (B \cup A) = B \cup (\Gamma \cap A)$$

for every $B$ contained in $\Gamma$.

Another observation to be made in this connection is that when (3) holds
for all $B$ contained in $A$ one has

$$B = A \cap (B \cup \Gamma)$$

for every $B$ such that

$$A \supset B \supset A \cap \Gamma$$

and the converse is also true.

In the third and final case of the Dedekind law we shall investigate when the
condition (3) holds for all closure relations $\Gamma$ for a fixed pair of closure
relations $A \supset B$. Let us suppose that $B_1$ is a set closed under $B$ but not
under $A$. The closure $A(B_1) = A_1$ of $B_1$ under $A$ then contains $B_1$ as a
proper subset. We shall now select $\Gamma = M_{C_1}$ as a maximal closure relation
with the principal set $C_1 \neq O$ taken such that

$$B_1 = A_1 \cdot C_1.$$

This means that $C_1$ has the form

$$C_1 = B_1 + A_2$$

where $A_2$ may be an arbitrary set contained in $S - A_1$. According to theorem
14 one must have

$$B_1 = A_1 \cdot C_1 = A_1 \cdot (B \cup \Gamma)(B_1).$$

But since one only has the possibility

$$(B \cup \Gamma)(B_1) = C_1$$

it follows that $C_1$ is closed under $B$. Thus we have that for every set $B_1$
closed under $B$ but not under $A$ all sets

$$C_1 = B_1 + A_2, \qquad A_2 \subset S - A(B_1)$$

must be closed under $B$.
When this condition is satisfied the Dedekind law will hold for an arbitrary 




closure relation $\Gamma$. If namely for some set $C_1$ which is closed under
$\Gamma$ and some set $A_1$ closed under $A$ the intersection

$$B_1 = A_1 \cdot C_1$$

is closed under $B$ then $C_1$ has the form

$$C_1 = B_1 + A_2$$

where $A_2$ is contained in $S - A(B_1)$ and $C_1$ is closed also under $B$ so that
the conditions of theorem 14 are immediately satisfied.

THEOREM 18. The necessary and sufficient condition that the Dedekind relation
(3) hold for a fixed pair of closure relations $A \supset B$ and an arbitrary
$\Gamma$ is that for every set $B_1$ which is closed under $B$ but not under $A$ all
sets of the form

$$C_1 = B_1 + A_2, \qquad A_2 \subset S - A(B_1)$$

be closed under $B$.

There are a great number of laws related to the Dedekind law which can be
analysed in the same manner and similar results can be derived without difficulty.
These investigations have also been carried through for the distributive law

$$A \cap (B \cup \Gamma) = (A \cap B) \cup (A \cap \Gamma)$$

and its dual but the results shall not be stated.

In connection with the Dedekind law one can also investigate when the
analogue of the algebraic law of isomorphism holds. Let $A$ and $B$ denote two
elements in some structure. The law of isomorphism states that there shall
exist a structure isomorphism between the two quotient structures

$$A \cup B / A \cong B / A \cap B. \tag{8}$$

In algebraic systems in which this structure isomorphism holds it is usually
established by means of the so-called regular structure correspondence, which
may be defined as follows. Let us denote by $X$ and $Y$ arbitrary elements in the
two quotient structures so that

$$A \cup B \supset X \supset A, \qquad B \supset Y \supset A \cap B. \tag{9}$$

The regular structure correspondence is then defined by

$$X \to B \cap X, \qquad Y \to A \cup Y. \tag{10}$$

It is not difficult to prove that these correspondences (10) establish a structure
isomorphism (8) if and only if for every $X$ and $Y$ satisfying (9) one has

$$X = A \cup (B \cap X), \qquad Y = B \cap (A \cup Y). \tag{11}$$

In the case of the structure of closure relations the two conditions (11) can be
analysed by means of the results obtained in the theorems 15 and 16 and one
obtains directly:




THEOREM 19. The necessary and sufficient condition for the regular structure
correspondence (10) to define a structure isomorphism (8) is that any intersection
$A_1 \cdot B_1$ of sets closed under $A$ and $B$ be closed either under $A$ or under
$B$ and also be representable in the form


= U B)(B,). 


4. The product of closure relations 


The result of applying one closure relation after the other shall be called the
product of the two closure relations. If $\Gamma_1$ and $\Gamma_2$ are the two given
closure relations we shall write

$$\Gamma_1 \times \Gamma_2(A) = \Gamma_1(\Gamma_2(A)).$$

The correspondence

$$A \to \Gamma_1(\Gamma_2(A))$$

is an increasing and order preserving correspondence, but it is not always
idempotent, so that the product of two closure relations is usually not a closure
relation.

Under special conditions the product may be a closure relation and we mention 
first the following simple case: 

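THEOREM 20. When $\Gamma_1 \supset \Gamma_2$ are two closure relations, one
containing the other, one has

$$\Gamma_1 \times \Gamma_2 = \Gamma_2 \times \Gamma_1 = \Gamma_1.$$

PROOF: Since $\Gamma_1 \supset \Gamma_2$ one has

$$\Gamma_2(\Gamma_1(A)) = \Gamma_1(A)$$

because $\Gamma_1(A)$ is also closed under $\Gamma_2$. To prove the other half of
the theorem we observe that from

$$\Gamma_1(A) \supseteq \Gamma_2(A)$$

follows

$$\Gamma_1(\Gamma_1(A)) = \Gamma_1(A) \supseteq \Gamma_1(\Gamma_2(A)) \supseteq \Gamma_1(A)$$

so that

$$\Gamma_1(A) = \Gamma_1(\Gamma_2(A)).$$

Among the further properties of the product of two closure relations we
mention the inclusions

$$\Gamma_1 \cup \Gamma_2(A) \supseteq \Gamma_1 \times \Gamma_2(A) \supseteq \Gamma_1(A) + \Gamma_2(A). \tag{12}$$

Only the first of these needs any proof and this follows from

$$\Gamma_1 \cup \Gamma_2(A) \supseteq \Gamma_2(A)$$

by taking the $\Gamma_1$-closure of both sides and applying theorem 20.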




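Let us say that the closure relation $\Gamma$ contains the product
$\Gamma_1 \times \Gamma_2$ if

$$\Gamma(A) \supseteq \Gamma_1 \times \Gamma_2(A)$$

for every subset $A$. In this case one sees that

$$\Gamma(A) \supseteq \Gamma_1 \cup \Gamma_2(A)$$

so that we have shown:

THEOREM 21. The union $\Gamma_1 \cup \Gamma_2$ is the smallest closure relation
containing the product $\Gamma_1 \times \Gamma_2$.

When this result is combined with (12) one obtains:

THEOREM 22. The necessary and sufficient condition that the product
$\Gamma_1 \times \Gamma_2$ of two closure relations be a closure relation is that

$$\Gamma_1 \times \Gamma_2 = \Gamma_1 \cup \Gamma_2.$$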

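This in turn leads to another criterion:

THEOREM 23. The necessary and sufficient condition for $\Gamma_1 \times \Gamma_2$
to be a closure relation is that the $\Gamma_1$-closure of any $\Gamma_2$-closed set
be $\Gamma_2$-closed.

PROOF: If

$$\Gamma_1\Gamma_2(A) = \Gamma_2(B)$$

one has

$$\Gamma_2\Gamma_1\Gamma_2(A) = \Gamma_2(B) = \Gamma_1\Gamma_2(A)$$

hence

$$\Gamma_1\Gamma_2\Gamma_1\Gamma_2(A) = \Gamma_1\Gamma_2(A)$$

so that the product $\Gamma_1 \times \Gamma_2$ is an idempotent relation and
therefore according to a previous remark, a closure relation. Conversely let
$\Gamma_1 \times \Gamma_2$ be a closure relation, hence according to theorem 22

$$\Gamma_1 \times \Gamma_2 = \Gamma_1 \cup \Gamma_2$$

and the condition of theorem 23 is satisfied.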
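
The criterion of Theorem 23 can be tested against a direct check of idempotence on a small set. The sketch below is an illustration only; the representation of a closure relation by its closed sets and the names `product`, `product_is_idempotent` and `theorem23_condition` are assumptions of this example.

```python
from itertools import combinations

def subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def closure(ring, A):
    """Smallest closed set of the ring containing A."""
    result = frozenset().union(*ring)
    for C in ring:
        if A <= C:
            result &= C
    return result

def product(r1, r2, A):
    """Gamma_1 x Gamma_2 applied to A: Gamma_2-closure first, then Gamma_1-closure."""
    return closure(r1, closure(r2, A))

def product_is_idempotent(r1, r2, S):
    return all(product(r1, r2, product(r1, r2, A)) == product(r1, r2, A)
               for A in subsets(S))

def theorem23_condition(r1, r2):
    """The Gamma_1-closure of every Gamma_2-closed set is Gamma_2-closed."""
    return all(closure(r1, C) in r2 for C in r2)

fs = frozenset
S = fs({1, 2, 3})
g1 = {fs(), fs({1, 2}), S}
g2 = {fs(), fs({1}), S}

# g1 x g2 sends {1} to {1,2} and {1,2} to S, so it is not idempotent;
# g2 x g1 sends every non-void set to S and is a closure relation.
print(product_is_idempotent(g1, g2, S), theorem23_condition(g1, g2))
print(product_is_idempotent(g2, g1, S), theorem23_condition(g2, g1))
```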


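If both products $\Gamma_1 \times \Gamma_2$ and $\Gamma_2 \times \Gamma_1$ shall be
closure relations one obtains according to theorem 22

$$\Gamma_1 \times \Gamma_2 = \Gamma_2 \times \Gamma_1 = \Gamma_1 \cup \Gamma_2$$

hence the order of the factors in the product is immaterial and we say that the
two closure relations commute. Conversely if $\Gamma_1$ and $\Gamma_2$ do commute
one has

$$(\Gamma_1 \times \Gamma_2) \times (\Gamma_1 \times \Gamma_2) = (\Gamma_1 \times \Gamma_1) \times (\Gamma_2 \times \Gamma_2) = \Gamma_1 \times \Gamma_2$$

so that $\Gamma_1 \times \Gamma_2$ is an idempotent relation, hence a closure
relation. Thus we can say:

The necessary and sufficient condition for both products
$\Gamma_1 \times \Gamma_2$ and $\Gamma_2 \times \Gamma_1$ of two closure relations
to be closure relations is that $\Gamma_1$ and $\Gamma_2$ commute.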




When $\Gamma_1$ and $\Gamma_2$ commute the $\Gamma_1$-closure of any
$\Gamma_2$-closed set is $\Gamma_2$-closed and similarly the $\Gamma_2$-closure of
any $\Gamma_1$-closed set is $\Gamma_1$-closed.


5. Automorphisms and characterization of the structure of closure relations 


Let us consider briefly the problem of determining certain characteristic 
properties of the structure of closure relations so that any structure with these 
properties is isomorphic to a structure of all closure relations over some set. 

We have already observed that every closure relation is the intersection of
the maximal closure relations in which it is contained. This leads us to
investigate particularly the interrelation between various maximal closure
relations. The cross-cut $M_A \cap M_B$ is a closure relation with the closed sets

$$\{O, S, A, B, A \cdot B\}$$

where some of these sets may coincide. This shows that for a pair of maximal
closure relations there can be at most one other maximal closure relation namely
$M_{A \cdot B}$ such that

$$M_{A \cdot B} \supset M_A \cap M_B.$$

Furthermore there are two other, usually non-maximal closure relations with
the closed sets

$$\{O, S, A, A \cdot B\}, \qquad \{O, S, B, A \cdot B\}$$

containing the cross-cut of $M_A$ and $M_B$. Among the maximal closure relations
one can introduce a partial order by writing

$$M_A \supset M_B \tag{13}$$

whenever $A \supset B$. This partial order can also be introduced in a purely
structural manner by saying that (13) holds if and only if there exists another
maximal closure relation $M_C$ such that

$$M_B \supset M_A \cap M_C.$$

Let us also mention that there exist minimal elements $M_a$ containing no other
maximal closure relation in the sense of (13). Every maximal closure relation
$M_A$ is uniquely determined by the set of all such minimal $M_a$ for which

$$M_A \supset M_a.$$

Clearly the set of all maximal closure relations ordered by the inclusion relation
(13) is a system isomorphic with the set of all subsets of $S$. This may also be
expressed structurally according to the Tarski-Stone theorem by saying that the
set of all maximal closure relations forms a completely distributive complete
Boolean algebra with respect to the inclusion (13).

From these remarks one sees that if a structure is to be isomorphic to a
structure of all closure relations over a set, there must exist maximal elements
and to any pair of maximal elements $M_1$ and $M_3$ there can exist at most one
other maximal element $M_2$ such that

$$M_2 \supset M_1 \cap M_3.$$




In this case we shall write $M_1 \supset M_2$. One must then impose such axiomatic
conditions on the maximal elements of the structure that this defines a partial
order of the set of all maximal elements, and furthermore such that this partially
ordered set is a complete Boolean algebra. Thus by the theorem of Tarski-Stone
just mentioned every maximal element becomes associated with a unique subset
of some set $S$ and conversely. Finally one must postulate that every element
$\Gamma$ in the structure is the cross-cut of the maximal elements in which it is
contained, so that $\Gamma$ becomes associated with a family of sets consisting of
all the subsets of $S$ which belong to the various maximal elements containing
$\Gamma$. We shall not go into further details of such a theory. It may only be
mentioned that a somewhat similar characterisation has been carried out by the
author for the case of the structure of all equivalence relations over a set
(Ore [1]).

To conclude let us mention one result which is almost an immediate con- 
sequence of the preceding remarks: 

THEOREM 24. The group of automorphisms of the structure $\Sigma$ of all closure
relations defined over a set $S$ consists of all one-to-one correspondences of the
set $S$ to itself.

PROOF: Clearly any one-to-one correspondence of the set $S$ represents an
automorphism of $\Sigma$. Conversely let us consider some automorphism $\alpha$ of
$\Sigma$. Under $\alpha$ any maximal closure relation $M_A$ must be transformed into
another maximal closure relation $M_{A'}$. Those maximal closure relations $M_a$ in
which the principal set $A = a$ is a single element have also been characterized
structurally so that $\alpha$ must transform each $M_a$ into some other closure
relation which we can denote by $M_{a\beta}$. This defines $\beta$ as a one-to-one
correspondence of the set $S$ and the automorphism $\beta^{-1} \cdot \alpha$ is seen
to leave all $M_a$ fixed. Since every $M_A$ is uniquely determined by the set of
$M_a$ such that $M_A \supset M_a$ it follows that every maximal closure relation,
hence every closure relation remains fixed under the automorphism
$\beta^{-1} \cdot \alpha$. Thus we see that $\alpha$ coincides with the
transformation $\beta$ of the set $S$.


6. Homomorphisms of the structure of closure relations 


Let us turn to the determination of the homomorphisms of the structure of
closure relations. We shall recall that a homomorphism $\alpha$ is a correspondence
between the structure $\Sigma$ and another structure $\Sigma^\alpha$ such that every
element in $\Sigma^\alpha$ is the image of at least one element in $\Sigma$ and such
that

$$\Gamma_1 \to \Gamma_1^\alpha, \qquad \Gamma_2 \to \Gamma_2^\alpha$$

shall imply

$$(\Gamma_1 \cap \Gamma_2)^\alpha = \Gamma_1^\alpha \cap \Gamma_2^\alpha, \qquad (\Gamma_1 \cup \Gamma_2)^\alpha = \Gamma_1^\alpha \cup \Gamma_2^\alpha. \tag{14}$$

A homomorphism $\alpha$ is said to be complete when the relations (14) hold for an
arbitrary finite or infinite number of components.

Let us consider first the so-called modular homomorphisms. We denote by


$\Delta$ some fixed closure relation and define a correspondence $\alpha$ of
$\Sigma$ to the substructure of all closure relations contained in $\Delta$ by
putting

$$\Gamma \to \Gamma^\alpha = \Gamma \cap \Delta. \tag{15}$$

Clearly this correspondence satisfies the first condition in (14). In order that
it shall satisfy the second condition it is necessary and sufficient that for every
pair of closure relations $\Gamma_1$ and $\Gamma_2$ one shall have

$$(\Gamma_1 \cap \Delta) \cup (\Gamma_2 \cap \Delta) = (\Gamma_1 \cup \Gamma_2) \cap \Delta. \tag{16}$$

When this condition (16) is fulfilled for all $\Gamma_1$ and $\Gamma_2$ we shall say
that $\Delta$ is a distributive closure relation and the resulting homomorphism
defined by (15) is a modular homomorphism.

All the distributive closure relations may be determined as follows: We
assume that the two trivial cases $\Delta = V$ and $\Delta = I$ are excluded and
denote by $D$ some set different from $O$ and $S$ which is closed under $\Delta$.
The two closure relations

$$\Gamma_1 = M_{C_1}, \qquad \Gamma_2 = M_{C_2}$$

are taken as maximal closure relations whose principal sets $C_1$ and $C_2$ are
selected in the following manner: We take $C_1$ as an arbitrary subset of $D$ and
define

$$C_2 = C_1 + K_2$$

where $K_2$ is some non-void subset of $S - D$. In this case the set

$$C_1 = D \cdot C_1 = D \cdot C_2$$

is closed both under $\Gamma_1 \cap \Delta$ and $\Gamma_2 \cap \Delta$ hence under
the left-hand side of (16). On the other hand the union $\Gamma_1 \cup \Gamma_2$ is
the universal closure relation $V$ so that the right-hand side in (16) is $\Delta$.
Thus we see that $C_1$ must be closed under $\Delta$. We have shown therefore that a
distributive closure relation must have the property that every subset of a closed
set $D \neq S$ is closed.

Conversely it is not difficult to verify that this condition is sufficient for a
closure relation to be distributive. Let $\Gamma_1$ and $\Gamma_2$ be two arbitrary
closure relations and $\Delta$ a closure relation with the property that any subset
of a closed set $D \neq S$ is closed. In this case one sees that the sets closed
under $\Gamma_1 \cap \Delta$ are those which are closed under $\Gamma_1$ or $\Delta$
and similarly the sets closed under $\Gamma_2 \cap \Delta$ are closed under
$\Gamma_2$ or $\Delta$. Consequently the sets closed simultaneously under
$\Gamma_1 \cap \Delta$ and $\Gamma_2 \cap \Delta$ are either sets closed under
$\Delta$ or sets closed both under $\Gamma_1$ and $\Gamma_2$ so that the relation
(16) holds. The relation analogous to (16) for an arbitrary finite or infinite
number of closure relations $\Gamma_i$

$$\bigvee (\Gamma_i \cap \Delta) = \left(\bigvee \Gamma_i\right) \cap \Delta \tag{17}$$

is also seen to hold when $\Delta$ is distributive. A closure relation $\Delta$ for
which (17) always holds may be called completely distributive. Such completely
distributive



closure relations are seen to define complete homomorphisms of the structure.
We have shown therefore:

THEOREM 25. The necessary and sufficient condition for a closure relation
$\Delta$ to be distributive is that every subset of a closed set $D \neq S$ be
closed. In this case $\Delta$ is completely distributive and the corresponding
modular homomorphism is complete.
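
The construction used in the proof can be replayed on a three-element set. The sketch below is illustrative and not part of the original text; a closure relation is represented by its family of closed sets, and the names `meet`, `join`, `satisfies_16` and `subsets_of_closed_sets_are_closed` are hypothetical. It tests relation (16) for one pair of maximal closure relations only, which is enough to exhibit a failure but does not by itself establish distributivity.

```python
from itertools import combinations

def intersection_ring(family):
    """Close a family of frozensets under pairwise intersection."""
    ring = set(family)
    changed = True
    while changed:
        changed = False
        for X, Y in combinations(list(ring), 2):
            if X & Y not in ring:
                ring.add(X & Y)
                changed = True
    return ring

def meet(r1, r2):
    """Intersection of two closure relations: ring generated by both families."""
    return frozenset(intersection_ring(r1 | r2))

def join(r1, r2):
    """Union of two closure relations: the common closed sets."""
    return frozenset(r1 & r2)

def satisfies_16(g1, g2, delta):
    return join(meet(g1, delta), meet(g2, delta)) == meet(join(g1, g2), delta)

def subsets_of_closed_sets_are_closed(delta, S):
    """Condition of Theorem 25: every subset of a closed set D != S is closed."""
    return all(frozenset(c) in delta
               for D in delta if D != S
               for r in range(len(D) + 1)
               for c in combinations(sorted(D), r))

fs = frozenset
S = fs({1, 2, 3})
g1 = frozenset({fs(), fs({1}), S})        # maximal relation with principal set C_1 = {1}
g2 = frozenset({fs(), fs({1, 3}), S})     # maximal relation with principal set C_2 = {1, 3}
bad = frozenset({fs(), fs({1, 2}), S})    # D = {1, 2} is closed, its subset {1} is not
good = frozenset({fs(), fs({1}), fs({2}), fs({1, 2}), S})

for delta in (bad, good):
    print(subsets_of_closed_sets_are_closed(delta, S), satisfies_16(g1, g2, delta))
```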

The homomorphism (15) associates with each closure relation $\Gamma$ the closure
relation $\Gamma \cap \Delta$ obtained from $\Gamma$ by adjoining to the closed sets
$\Gamma(A)$ those subsets of $\Gamma(A)$ which are closed under $\Delta$.

In connection with the distributive relation (16) we shall also analyze the
dual relation

$$(\Gamma_1 \cup \Delta) \cap (\Gamma_2 \cup \Delta) = (\Gamma_1 \cap \Gamma_2) \cup \Delta. \tag{18}$$

Any closure relation $\Delta$ which satisfies this relation (18) for every pair of
closure relations $\Gamma_1$ and $\Gamma_2$ may be called dually distributive. Any
dually distributive closure relation defines a dual modular homomorphism through
the correspondence

$$\Gamma \to \Gamma^\alpha = \Gamma \cup \Delta. \tag{19}$$

Let us determine all dually distributive closure relations $\Delta$. Again we omit
the two trivial cases $\Delta = V$ and $\Delta = I$. We denote by $D$ some set
different from $S$ and $O$ which is closed under $\Delta$. Furthermore let $C_1$ and
$C_2$ be two different sets such that

$$D = C_1 \cdot C_2, \qquad C_1 \supset D, \qquad C_2 \supset D. \tag{20}$$

Two such sets $C_1$ and $C_2$ can always be found provided $D$ is not a maximal
subset $S - d'$ of $S$. One of them can be taken as an arbitrary set containing $D$.
Next we define $\Gamma_1$ and $\Gamma_2$ as maximal closure relations

$$\Gamma_1 = M_{C_1}, \qquad \Gamma_2 = M_{C_2}$$

with the principal sets $C_1$ and $C_2$. According to (20) the set $D$ is closed
under $\Gamma_1 \cap \Gamma_2$ and $\Delta$, hence under their union
$(\Gamma_1 \cap \Gamma_2) \cup \Delta$. Consequently $D$ must also be closed under
the left-hand side of (18). But aside from $O$ and $S$ the closure relations
$\Gamma_1 \cup \Delta$ and $\Gamma_2 \cup \Delta$ can only have the closed sets
$C_1$ and $C_2$ and these only when they are also closed under $\Delta$. Thus in
(20) both $C_1$ and $C_2$ are closed under $\Delta$ and we have as a necessary
condition for a closure relation $\Delta$ to be dually distributive that every set
$C$ containing a closed set $D \neq O$ of $\Delta$ must itself be closed under
$\Delta$.

Conversely when this condition is satisfied the relation (18) will hold for all
pairs of closure relations $\Gamma_1$ and $\Gamma_2$. If namely $C_1$ and $C_2$ are
sets closed under $\Gamma_1$ and $\Gamma_2$ respectively such that their
intersection $D = C_1 \cdot C_2$ is closed under $\Delta$ then $C_1$ and $C_2$ must
also be closed under $\Delta$ so that $D$ is closed under the left-hand side of
(18). Also in this case one finds

$$\bigwedge (\Gamma_i \cup \Delta) = \left(\bigwedge \Gamma_i\right) \cup \Delta$$




530 OYSTEIN ORE 


for an arbitrary set of closure relations $\Gamma_i$ so that a dually distributive closure
relation is at the same time dually completely distributive and the homomorphism
(19) is complete.

The condition for a closure relation $\Delta$ to be dually distributive can be expressed
in a somewhat different manner. When $D$ is a set closed under $\Delta$ every maximal
set $S - d'$ containing $D$ is closed. Consequently the sets closed under $\Delta$ may be
generated by the sets $S - d'$ where $d'$ runs through all isolated points $d'$ of $\Delta$.
Thus we see that the sets closed under $\Delta$ are those which contain the fixed set


$O'_\Delta = S - O_\Delta$


where $O_\Delta$ is the set of isolated points under $\Delta$. We state therefore:

THEOREM 26. The necessary and sufficient condition that a closure relation be
dually distributive is that its closed sets consist of all sets containing some fixed
arbitrary set $O'_\Delta$. Such closure relations are always dually completely distributive.
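
The statement of Theorem 26 can be checked by brute force on a small set. The following Python sketch is only an illustration and rests on modelling assumptions that this section does not spell out in full: a closure relation is represented by its family of closed sets, this family is assumed to contain $O$ and $S$ and to be closed under intersection, the union of two closure relations is assumed to have as closed sets those common to both, and the cross-cut is assumed to have as closed sets the intersections $C_1 \cdot C_2$. All function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement

def powerset(S):
    """All subsets of S as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def all_closure_relations(S):
    """Every family of closed sets over S (contains O and S, closed under intersection)."""
    top = frozenset(S)
    inner = [x for x in powerset(S) if x and x != top]
    found = []
    for r in range(len(inner) + 1):
        for chosen in combinations(inner, r):
            fam = frozenset(chosen) | {frozenset(), top}
            if all((a & b) in fam for a in fam for b in fam):
                found.append(fam)
    return found

def union(g1, g2):
    """Union of closure relations: the closed sets common to both (assumed model)."""
    return g1 & g2

def crosscut(g1, g2):
    """Cross-cut of closure relations: closed sets are the intersections C1 . C2."""
    return frozenset(a & b for a in g1 for b in g2)

def satisfies_18(delta, relations):
    """Check relation (18) for delta against every pair of closure relations."""
    return all(
        crosscut(union(g1, delta), union(g2, delta)) == union(crosscut(g1, g2), delta)
        for g1, g2 in combinations_with_replacement(relations, 2))

def supersets_of(F, S):
    """Closed sets of the form in Theorem 26: O together with all sets containing F."""
    return frozenset(x for x in powerset(S) if F <= x) | {frozenset()}

S = {0, 1, 2}
relations = all_closure_relations(S)
dually_distributive = {d for d in relations if satisfies_18(d, relations)}
theorem_26_form = {supersets_of(F, S) for F in powerset(S)}
print(dually_distributive == theorem_26_form)
```

In this model the two trivial cases omitted in the text appear as the choices $F = O$ and $F = S$ of the fixed set.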

We shall proceed to the determination of all complete homomorphisms of the 
structure of closure relations. The solution of this problem is derived as a 
consequence of an analysis of the homomorphisms of a general class of structures.
Let us assume first that $\Sigma$ is some complete structure and $\alpha$ a complete
homomorphism of $\Sigma$. All those elements $a$ in $\Sigma$ which have the same image
$a^\alpha$ under $\alpha$ are seen to form a complete substructure $\Sigma_a$ of $\Sigma$. We denote by
$a_1$ and $a_2$ respectively the universal element and the zero element of $\Sigma_a$ so that
all elements $a$ having the same image $a^\alpha$ lie between $a_1$ and $a_2$


$a_1 \ge a \ge a_2.$


In particular let $e_1$ be the greatest element in $\Sigma$ having the same image $e^\alpha$ as the
zero element $e$ of $\Sigma$. Since $a$ and $a \cup e_1$ will always have the same image one
sees that for any $a$


(21) $a_1 \ge a_2 \cup e_1.$


Next we suppose that $\Sigma$ is a point structure so that every element $a$ is the
union of the set $A_a$ of those points $p_a$ which are contained in it


$a = \bigvee_{p_a \subset A_a} p_a.$


From $a_1 \ge a_2$ one concludes $A_{a_1} \supset A_{a_2}$ so that one can write


$A_{a_1} = A_{a_2} + C$

where $C$ is the set of points $p_c$ contained in $a_1$ but not in $a_2$. We denote by


$c = \bigvee p_c$


the union of all points $p_c$ contained in $C$, so that one has

(22) $a_1 = a_2 \cup c.$

For any point $p_c$ in $c$ obviously


$a_2 \cap p_c = e, \quad a_2 \subset a_2 \cup p_c \subset a_1.$




COMBINATIONS OF CLOSURE RELATIONS 531 


The homomorphism $\alpha$ can now be applied to these relations. One obtains

$a^\alpha \cap p_c^\alpha = e^\alpha, \quad a^\alpha \cup p_c^\alpha = a^\alpha$

and from these two relations one concludes

$p_c^\alpha = e^\alpha$

so that all points $p_c$ in $c$ have the same image $e^\alpha$ as $e$. But since $\alpha$ is a complete
homomorphism and since $c$ is the union of all $p_c$ one obtains further


$c^\alpha = e^\alpha.$


This shows that $e_1 \ge c$ where $e_1$ is the maximal element with the same image
as $e$. Thus one concludes finally from (21) and (22) that


$a_1 = a_2 \cup e_1$

so that two elements $a'$ and $a''$ have the same image $a^\alpha$ if and only if

$a' \cup e_1 = a'' \cup e_1$

and we have proved our main result:
THEOREM 27. In a complete point structure all complete homomorphisms are
dually modular.

A structure in which the finite chain condition is satisfied is always complete
and it is easily seen that its homomorphisms must also be complete so that we
can state:

THEOREM 28. In a point structure satisfying the finite chain condition all
homomorphisms are dually modular.

It had already been established that the structure of all closure relations
over a set is a dual point structure, i.e. every element is the cross-cut of the
maximal closure relations in which it is contained. From the dual of theorem
27 one obtains therefore:

THEOREM 29. In the structure of all closure relations over a set all complete
homomorphisms are modular.

I have not been able to determine all non-complete homomorphisms although 
certain types of such homomorphisms have been found, for instance by the 
following construction: Let $\Sigma_0$ be some substructure of $\Sigma$. We shall introduce
an equivalence relation in $\Sigma$ by writing

$\Gamma_1 \equiv \Gamma_2 \pmod{\Sigma_0}$

provided there exists an element $\Delta_0$ in $\Sigma_0$ such that (Ore [1])

$\Gamma_1 \cap \Delta_0 = \Gamma_2 \cap \Delta_0.$


One finds without difficulty that this equivalence defines a homomorphism of
$\Sigma$ if and only if to every pair $\Gamma_1$ and $\Gamma_2$ of closure relations and every $\Delta_0$ in $\Sigma_0$
one can find another closure relation $\Delta_0'$ in $\Sigma_0$ such that


(23) $((\Gamma_1 \cap \Delta_0) \cup (\Gamma_2 \cap \Delta_0)) \cap \Delta_0' = (\Gamma_1 \cup \Gamma_2) \cap \Delta_0'.$

This condition is fulfilled for a substructure $\Sigma_0$ which has the property that




to every $\Delta_0$ in $\Sigma_0$ also $\Delta_0^*$ is in $\Sigma_0$, where $\Delta_0^*$ is the distributive closure
relation obtained from $\Delta_0$ through the adjunction of all sets contained in the
closed sets $D_0 \neq S$ of $\Delta_0$. Whether all substructures $\Sigma_0$ satisfying (23) can
be obtained this way is not known. Nor is it known whether all non-complete
homomorphisms are of this generalized modular form.


7. Point closures 


In closure relations some assumption is usually made about the closure of a
point. The simplest requirement is of course:

Closed points. Every point is closed.

The general closure relations in which every point is closed are also seen to
form a complete dual point structure $\Sigma'$. This structure $\Sigma'$ is homomorphic to
the structure $\Sigma$ of all general closure relations and it consists of all closure
relations contained in the closure relation $\Gamma_0$ in which the points are the only closed
sets besides $O$ and $S$. In this structure $\Sigma'$ the maximal closure relations $M'_A$ are
those whose closed sets are $O$, $S$, the set $A$ and all points. The structural theory
of these closure relations is practically the same as for general closure relations.

Instead of assuming that every point is a closed set one can make the weaker
assumption that every point is determined by its closure:

Point determination. When $\bar{a} = \bar{b}$ then $a = b$.


This condition can also be stated in a different form which is often more
convenient in its applications. To arrive at this reformulation let us recall first
that a complete field of sets over $S$ is a complete ring $\mathfrak{R}$ with the additional
property that the complement $S - A$ of every set $A$ in $\mathfrak{R}$ is also in the ring. Let
us also recall that there exists a one-to-one correspondence between the complete
fields of sets over $S$ and the partitions $\mathfrak{P}(B_i)$ of $S$ such that each complete field
$\mathfrak{R}$ consists of the sums of the blocks $B_i$ of a particular partition $\mathfrak{P}$ of $S$. (Ore [1])

Let $\mathfrak{F}(A_m)$ be a family of sets covering the set $S$, i.e. every element $a$ is
contained in at least one set $A_m$. We assume that the indices $m$ of the sets $A_m$ in
the family run through an index set $M$ whose cardinal number may be arbitrary.
Since every such family of sets $\mathfrak{F}(A_m)$ generates a least field of sets over $S$ in
which it is contained, there is also associated a unique partition $\mathfrak{P}(B_{M_1})$ of $S$
with every family $\mathfrak{F}(A_m)$. The blocks $B_{M_1}$ in this partition may be obtained
by the following construction. Let $a$ be an element in $S$ and $M_1$ the set of all
indices $m_1$ for which the set $A_{m_1}$ contains $a$. Similarly $M_2 = M - M_1$ is the set of
those indices $m_2$ for which $A_{m_2}$ does not contain $a$. Consequently the first of
the two sets


$\prod_{m_1 \subset M_1} A_{m_1}, \quad \sum_{m_2 \subset M_2} A_{m_2}$


contains $a$ while the second does not. Thus


(24) $B_{M_1} = \prod_{m_1 \subset M_1} A_{m_1} - \prod_{m_1 \subset M_1} A_{m_1} \cdot \sum_{m_2 \subset M - M_1} A_{m_2}$


is a set containing $a$. Two sets $B_{M_1}$ determined by two different elements $a_1$
and $a_2$ are seen to be either disjoint or identical so that when $M_1$ runs through




all subsets of $M$ one obtains a partition $\mathfrak{P}(B_{M_1})$ of $S$. The blocks (24) obviously
belong to the field generated by $\mathfrak{F}(A_m)$. On the other hand from


$A_{m_1} \cdot B_{M_1} = B_{M_1}, \quad A_{m_2} \cdot B_{M_1} = O$


follows that every set $A_m$ is the sum of such blocks so that the partition $\mathfrak{P}(B_{M_1})$
is the partition associated with the field generated by $\mathfrak{F}(A_m)$.
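
The construction (24) is easily carried out for a finite family. The following Python sketch is an illustration only; the function name `blocks` and the sample family are hypothetical, and the family is assumed to be finite and to cover $S$. For each element $a$ it forms the product of the sets containing $a$ and removes the sum of the sets not containing $a$.

```python
def blocks(S, family):
    """Partition of S generated by a covering family of sets, following (24)."""
    found = set()
    for a in S:
        containing = [A for A in family if a in A]    # the sets A_{m_1}, m_1 in M_1
        missing = [A for A in family if a not in A]   # the sets A_{m_2}, m_2 in M - M_1
        block = set(S)
        for A in containing:
            block &= A      # product (intersection) of the A_{m_1}
        for A in missing:
            block -= A      # remove the sum (union) of the A_{m_2}
        found.add(frozenset(block))
    return found

S = {1, 2, 3, 4}
fine = blocks(S, [{1, 2}, {2, 3}, {1, 2, 3, 4}])
coarse = blocks(S, [{1, 2}, {1, 2, 3, 4}])
print(sorted(sorted(b) for b in fine))
print(sorted(sorted(b) for b in coarse))
```

The first family separates all four points, while the second yields the two blocks $\{1, 2\}$ and $\{3, 4\}$.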

After these preparatory remarks we can state: 

THEOREM 30. The necessary and sufficient condition that in a closure relation
every point be uniquely determined by its closure is that the family of closed sets
define a complete partition of $S$ in which every block consists of a single point.

PROOF: If $a$ and $b$ are two different points with the same closure then they are
both contained in the same closed sets $A_m$ so that $a$ and $b$ must belong to the
same block $B_{M_1}$ in (24). On the other hand if $a$ and $b$ have different closures
then one cannot have simultaneously


$\bar{a} \supset b, \quad \bar{b} \supset a$


because the closed set $\bar{a} \cdot \bar{b}$ would contain both $a$ and $b$ against the fact that $\bar{a}$ and
$\bar{b}$ are the smallest closed sets containing $a$ and $b$ respectively. Thus when $a$ and
$b$ have different closures they belong to different blocks $B_{M_1}$ in (24). Theorem
30 follows as an immediate consequence.

From theorem 30 one concludes that when $\Gamma_1$ and $\Gamma_2$ are closure relations
whose points are determined by their closures, the union $\Gamma_1 \cup \Gamma_2$ whose closed
sets are those common to $\Gamma_1$ and $\Gamma_2$ will usually not have this property. This
shows that for the set of all closure relations in which every point is determined
by its closure the natural definition of a union breaks down.
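
A two-point example makes this failure concrete. The following Python sketch is an illustration only; the helper names are hypothetical, and a closure relation is again modelled by its family of closed sets. The closure of a point is computed as the intersection of all closed sets containing it.

```python
def closure(point, closed_sets, S):
    """Smallest closed set containing the point."""
    result = set(S)
    for C in closed_sets:
        if point in C:
            result &= C
    return frozenset(result)

def points_determined(closed_sets, S):
    """True when different points always have different closures."""
    closures = [closure(p, closed_sets, S) for p in S]
    return len(set(closures)) == len(closures)

S = {0, 1}
gamma1 = {frozenset(), frozenset({0}), frozenset(S)}
gamma2 = {frozenset(), frozenset({1}), frozenset(S)}
common = gamma1 & gamma2    # closed sets of the union: those common to both

print(points_determined(gamma1, S), points_determined(gamma2, S),
      points_determined(common, S))
```

Here each of the two closure relations determines its points, but in the union only $O$ and $S$ remain closed, so both points have the closure $S$.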


YALE UNIVERSITY. 


BIBLIOGRAPHY 


GARRETT BIRKHOFF [1]: On the combination of topologies. Fundamenta Math. v. 29 (1936)
pp. 156-166.

E. H. MOORE: Introduction to a form of general analysis. New Haven Mathematical
Colloquium 1910.

OYSTEIN ORE [1]: Theory of equivalence relations. Duke Math. Journ. v. 9 (1942) pp.
573-627.




ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


SOME REMARKS ON ALGEBRAS OVER AN ALGEBRAICALLY 
CLOSED FIELD* 


By C. Nesbitt and W. M. Scott
(Received October 11, 1942) 


The theory of rings with radicals is an interesting and far reaching problem of
modern algebra.¹ In this paper we have examined some aspects of algebras
which may have radicals and whose coefficient fields are algebraically closed.
Some of the methods employed clearly could be used for less restricted algebras,
but a full extension of the results requires the solution of a number of problems
still under investigation.² The authors feel that the theory of algebras over an
algebraically closed field has some interest and value in itself, particularly in
view of its immediate application to the representation theory of finite groups.
Moreover, this restricted case is a valuable testing ground for theorems for more
general rings.

In the first part of the paper we have studied the concept of basic algebra.
The basic algebras are semi-primitive subalgebras (i.e. modulo their radicals
are direct sums of division algebras, in our case, direct sums of fields) which for
the algebras under discussion play a role in some respects analogous to that
of division algebras for simple algebras. Related to the basic algebras are the
Cartan basis systems,³ and systems of elementary modules.⁴ The commutator
algebras of matrix representations of an algebra, or what is equivalent, the
algebras of homomorphisms of the related representation spaces, can be analyzed
in a rather simple manner.

We shall say that a linear function $\varphi$ of an algebra $\mathfrak{a}$ is symmetric, if for every
$\alpha, \beta \in \mathfrak{a}$, $\varphi(\alpha\beta) = \varphi(\beta\alpha)$. In the case where $\mathfrak{a}$ is over an algebraically closed field,
and is also semisimple, the characters of the irreducible representations of $\mathfrak{a}$
form a complete set of symmetric functions of $\mathfrak{a}$. When $\mathfrak{a}$ has a radical, this is
no longer true. In Part 2 of the paper, we discuss symmetric functions of
algebras with radical.

In Part 3 of the paper the regular representations are written in terms of
elementary modules.


* The following includes the second part of a dissertation written by W. M. Scott under 
the direction of C. Nesbitt, and accepted by the University of Michigan in January, 1941. 
[12], in the bibliography at the end, is the first part of the dissertation. A remark by R. M. 
Thrall, who read the dissertation, led to the development of Part 1 of this paper.

1 For some of the more recent developments, see the bibliography. 

2 B. Vinograde has done some work in this direction. See Bulletin of the American 
Mathematical Society, vol. 48, No. 5, Abstract 167. 

3 For a discussion of the Cartan basis system, see [11], §2. 

4 Elementary modules are defined in [12], §3. 




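I. BASIC ALGEBRAS†
1. Main concepts

Let $\mathfrak{a}$ be an algebra with unit element over an algebraically closed field $K$,
and let

(1) $\mathfrak{a} = \mathfrak{a}^* + \mathfrak{n}$

be a splitting of $\mathfrak{a}$ into a direct sum of a semisimple subalgebra $\mathfrak{a}^*$ and the radical
$\mathfrak{n}$ of $\mathfrak{a}$.⁵ We shall denote by

(2) $\mathfrak{a}^* = \mathfrak{a}_1^* + \cdots + \mathfrak{a}_k^*$


the unique splitting of $\mathfrak{a}^*$ into a direct sum of simple invariant subalgebras, and
the components of $\alpha \in \mathfrak{a}$ under the splittings (1), (2) by


(3) $\alpha = \alpha^* + \eta = \alpha_1^* + \cdots + \alpha_k^* + \eta.$

The radical $\mathfrak{n}$ possesses the series

(4) $\mathfrak{n} \supset \mathfrak{n}^2 \supset \cdots \supset \mathfrak{n}^m = (0)$

of invariant subalgebras of $\mathfrak{a}$.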

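We shall call a vector space $V$ which has a ring $\mathfrak{b}$ as left operator system and
a ring $\mathfrak{c}$ as right operator system a $(\mathfrak{b}, \mathfrak{c})$ space if, in addition to the usual operator
relations, for every $\xi \in \mathfrak{b}$, $X \in V$, $\zeta \in \mathfrak{c}$ the associativity condition


(5) $(\xi X)\zeta = \xi(X\zeta)$


is satisfied. The condition (5) implies that each $\zeta \in \mathfrak{c}$ determines a $\mathfrak{b}$-homomorphism,
$X \to X\zeta$, of $V$ considered as a $\mathfrak{b}$ left space, and similarly each $\xi \in \mathfrak{b}$
determines a $\mathfrak{c}$-homomorphism of $V$ considered as a $\mathfrak{c}$ right space. If we use
functional notation to denote $\mathfrak{b}$-homomorphisms of $V$, for example, write
$X \to f(X) = X\zeta$, $X \to f_1(X) = X\zeta_1$, then $f_1 f(X) = f_1(f(X)) = (X\zeta)\zeta_1 = X(\zeta\zeta_1)$. Thus
between elements of $\mathfrak{c}$ and elements of the $\mathfrak{b}$-homomorphism ring $\mathfrak{m}$ of $V$ we have
relations of form $\zeta \to f$, $\zeta_1 \to f_1$, $\zeta\zeta_1 \to f_1 f$. This shows that $\mathfrak{c}$ is homomorphically
mapped into the ring $\mathfrak{m}'$ inverse to $\mathfrak{m}$. We shall call $\mathfrak{m}'$ the inverse
$\mathfrak{b}$-homomorphism ring of $V$.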
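Let now


(6) $\mathfrak{a} = \mathfrak{a}_1 \supset \mathfrak{a}_2 \supset \cdots \supset \mathfrak{a}_r \supset \mathfrak{a}_{r+1} = (0)$


be a composition series for $\mathfrak{a}$ considered as an $(\mathfrak{a}, \mathfrak{a})$ space. The analysis of the
algebra $\mathfrak{a}$ we are considering is simplified by the following theorem.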


† The notion of basic algebra has been discussed in lectures by R. Brauer.
5 For the proof that such a splitting exists, see, for example, [1], p. 47. 




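1.1 Let $V$ be an irreducible $(\mathfrak{a}, \mathfrak{a})$ space.⁶ Then $V$ is also an irreducible $(\mathfrak{a}^*, \mathfrak{a}^*)$
space.

PROOF: $\mathfrak{n}V$ is an $(\mathfrak{a}, \mathfrak{a})$ subspace of $V$. Then $\mathfrak{n}V = (0)$ or $V$. If $\mathfrak{n}V = V$,
then $V = \mathfrak{n}^i V$ for $i = 1, 2, \cdots$ and, since $\mathfrak{n}$ is nilpotent, this implies $V = (0)$, in
which case the theorem is trivial. The same remarks hold for $V\mathfrak{n}$. If $\mathfrak{n}V = V\mathfrak{n} = (0)$, then
for $\alpha = \alpha^* + \eta \in \mathfrak{a}$, $X \in V$, $\alpha X = (\alpha^* + \eta)X = \alpha^* X$; similarly $X\alpha = X\alpha^*$. In
this case the operators $\alpha$ and $\alpha^*$ produce the same effect on $V$, so that $V$ irreducible
as an $(\mathfrak{a}, \mathfrak{a})$ space implies $V$ is irreducible as an $(\mathfrak{a}^*, \mathfrak{a}^*)$ space.

It is clear that a corresponding theorem holds for spaces having $\mathfrak{a}$ as one-sided
operator system.

1.2 A composition series of $\mathfrak{a}$ considered as an $(\mathfrak{a}, \mathfrak{a})$ space is also a composition
series of $\mathfrak{a}$ considered as an $(\mathfrak{a}^*, \mathfrak{a}^*)$ space.

PROOF: Each composition factor group in the series is an irreducible $(\mathfrak{a}, \mathfrak{a})$
space and by 1.1 is then an irreducible $(\mathfrak{a}^*, \mathfrak{a}^*)$ space.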

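1.3 Let $V$ be an irreducible $(\mathfrak{a}, \mathfrak{a})$ space $\neq (0)$. Then there exists a unique index
pair $(\kappa, \lambda)$, $\kappa, \lambda = 1, 2, \cdots,$ or $k$ such that $\mathfrak{a}_\kappa^* V \mathfrak{a}_\lambda^* = V$.

PROOF: By 1.1, $V = \mathfrak{a}^* V \mathfrak{a}^* = \sum_{\kappa,\lambda=1}^{k} \mathfrak{a}_\kappa^* V \mathfrak{a}_\lambda^*$. Each $\mathfrak{a}_\kappa^* V \mathfrak{a}_\lambda^*$ is an $(\mathfrak{a}^*, \mathfrak{a}^*)$
subspace of $V$, and so is either $(0)$ or $V$. At least one of these subspaces is different
from $(0)$, say $\mathfrak{a}_\kappa^* V \mathfrak{a}_\lambda^* \neq (0)$. If $V = \mathfrak{a}_\kappa^* V \mathfrak{a}_\lambda^*$ contains another subspace
$\mathfrak{a}_\rho^* V \mathfrak{a}_\sigma^* \neq (0)$, then $\mathfrak{a}_\rho^* V \mathfrak{a}_\sigma^* = \mathfrak{a}_\rho^* \mathfrak{a}_\kappa^* V \mathfrak{a}_\lambda^* \mathfrak{a}_\sigma^*$, and since $\mathfrak{a}_\rho^* \mathfrak{a}_\kappa^* = (0)$ for $\rho \neq \kappa$
and $\mathfrak{a}_\lambda^* \mathfrak{a}_\sigma^* = (0)$ for $\lambda \neq \sigma$, this gives a contradiction, and the theorem is proved.

We shall say that an irreducible $V$ is of type $(\kappa, \lambda)$ if $V = \mathfrak{a}_\kappa^* V \mathfrak{a}_\lambda^*$.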

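1.4 Let $V$ be an irreducible $(\mathfrak{a}, \mathfrak{a})$ space $\neq (0)$ and of type $(\kappa, \lambda)$. Let $e_{\kappa,ab}$
$(a, b = 1, 2, \cdots, f_\kappa)$ denote a set of matrix units for the simple algebra $\mathfrak{a}_\kappa^*$. Then
there exists a vector $X = e_{\kappa,11} X e_{\lambda,11}$ such that $e_{\kappa,a1} X e_{\lambda,1b}$ $(a = 1, 2, \cdots, f_\kappa$;
$b = 1, 2, \cdots, f_\lambda)$ form a K-basis of $V$.

PROOF: The unit element $e_\kappa = \sum_{a=1}^{f_\kappa} e_{\kappa,aa}$ of $\mathfrak{a}_\kappa^*$ is a left identity operator, and
$e_\lambda = \sum_{b=1}^{f_\lambda} e_{\lambda,bb}$ is a right identity operator for $V$. If $Y \neq 0 \in V$, $e_\kappa Y e_\lambda = Y$,
so that for some $p, q$, $e_{\kappa,pp} Y e_{\lambda,qq} \neq 0$. But $e_{\kappa,pp} Y e_{\lambda,qq} = e_{\kappa,p1} e_{\kappa,1p} Y e_{\lambda,q1} e_{\lambda,1q}$,
so $e_{\kappa,1p} Y e_{\lambda,q1} = X \neq 0$. Moreover, $e_{\kappa,11} X e_{\lambda,11} = X$. Since $e_{\kappa,11} \mathfrak{a}_\kappa^* e_{\kappa,11} = K e_{\kappa,11}$,
$e_{\lambda,11} \mathfrak{a}_\lambda^* e_{\lambda,11} = K e_{\lambda,11}$, it follows that $V' = \sum_{a,b} K e_{\kappa,a1} X e_{\lambda,1b}$ is an $(\mathfrak{a}^*, \mathfrak{a}^*)$
subspace $\neq (0)$ of $V$, and since $V$ is irreducible, then $V' = V$. This shows that
the elements $e_{\kappa,a1} X e_{\lambda,1b}$ $(a = 1, 2, \cdots, f_\kappa$, $b = 1, 2, \cdots, f_\lambda)$ give a K-basis for $V$.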
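
The basis of 1.4 can be seen in the simplest concrete model. The following Python sketch is an illustration under assumptions that are not part of the text: $V$ is taken to be the space of $f_\kappa \times f_\lambda$ matrices, the matrix units act by ordinary matrix multiplication from the left and from the right, integers stand in for elements of $K$, and the helper names are hypothetical.

```python
def unit(rows, cols, i, j):
    """rows x cols matrix with a single 1 in position (i, j), indices starting at 1."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(1, cols + 1)]
            for r in range(1, rows + 1)]

def matmul(A, B):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

f_kappa, f_lam = 2, 3
X = unit(f_kappa, f_lam, 1, 1)    # satisfies e_{kappa,11} X e_{lambda,11} = X

basis = {}
for a in range(1, f_kappa + 1):
    for b in range(1, f_lam + 1):
        left = unit(f_kappa, f_kappa, a, 1)   # e_{kappa,a1}
        right = unit(f_lam, f_lam, 1, b)      # e_{lambda,1b}
        basis[(a, b)] = matmul(matmul(left, X), right)

for key in sorted(basis):
    print(key, basis[key])
```

In this model the element $e_{\kappa,a1} X e_{\lambda,1b}$ is the rectangular matrix with a single 1 in position $(a, b)$, so the $f_\kappa f_\lambda$ elements are linearly independent and span $V$.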

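1.5 An irreducible $(\mathfrak{a}, \mathfrak{a})$ space $\neq (0)$ and of type $(\kappa, \lambda)$ is the direct sum of $f_\lambda$
irreducible $\mathfrak{a}$ left spaces and is also the direct sum of $f_\kappa$ irreducible $\mathfrak{a}$ right spaces.

PROOF: $V = V_1 + \cdots + V_{f_\lambda}$ where $V_b$ $(b = 1, 2, \cdots, f_\lambda)$ is the $\mathfrak{a}$-left space
with basis $e_{\kappa,a1} X e_{\lambda,1b}$ $(a = 1, 2, \cdots, f_\kappa)$. $V_b$ is an irreducible
$\mathfrak{a}_\kappa^*$ space and so an irreducible $\mathfrak{a}$ space. Similarly, $V = W_1 + \cdots + W_{f_\kappa}$,
where $W_a$ is the irreducible $\mathfrak{a}$ right space with basis $e_{\kappa,a1} X e_{\lambda,1b}$
$(b = 1, 2, \cdots, f_\lambda)$.

It will follow from a later result (cf. 4.1) that for $V$ as in 1.5 $\mathfrak{a}_\lambda^*$ is isomorphic
to the inverse $\mathfrak{a}_\kappa^*$-homomorphism ring of $V$.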


6 An irreducible $(\mathfrak{a}, \mathfrak{a})$ space is one which does not contain a proper $(\mathfrak{a}, \mathfrak{a})$ subspace.
7 $\delta_{\mu\nu}$ (Kronecker delta) $= 0$, $\mu \neq \nu$; $\delta_{\mu\mu} = 1$.




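In the following we shall denote the type of the composition factor group
$\mathfrak{a}_\mu/\mathfrak{a}_{\mu+1}$ by $(\kappa_\mu, \lambda_\mu)$, $(\mu = 1, 2, \cdots, r)$, and residue classes by $\langle \cdots \rangle$.

1.6 Let $\langle \beta_\mu \rangle \neq \langle 0 \rangle$ be chosen from $\mathfrak{a}_\mu/\mathfrak{a}_{\mu+1}$ $(\mu = 1, 2, \cdots, r)$ such that


$\beta_\mu = e_{\kappa_\mu,11}\, \beta_\mu\, e_{\lambda_\mu,11}.$

Then the elements


(7) $e_{\kappa_\mu,a1}\, \beta_\mu\, e_{\lambda_\mu,1b}, \quad a = 1, 2, \cdots, f_{\kappa_\mu};\ b = 1, 2, \cdots, f_{\lambda_\mu};\ \mu = 1, 2, \cdots, r$


form a K-basis of $\mathfrak{a}$.

PROOF: Since $\langle \beta_\mu \rangle \neq \langle 0 \rangle$, $\mu = 1, 2, \cdots, r$ it follows that $\beta_1, \beta_2, \cdots, \beta_r$
are linearly independent. Since by 1.4 the elements $\langle e_{\kappa_\mu,a1} \beta_\mu e_{\lambda_\mu,1b} \rangle$ form a basis
for $\mathfrak{a}_\mu/\mathfrak{a}_{\mu+1}$, the elements (7) give a basis for $\mathfrak{a}$.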

This basis has been referred to as the Cartan basis system in [5], [11], [12]. In
regard to this basis an element $\alpha \in \mathfrak{a}$ may be expressed as

(8) $\alpha = \sum_{\mu=1}^{r} \sum_{a,b} h^{\mu}_{ab}(\alpha)\, e_{\kappa_\mu,a1}\, \beta_\mu\, e_{\lambda_\mu,1b}.$

The additive group formed by the matrices $H_\mu(\alpha) = (h^{\mu}_{ab}(\alpha))_{ab}$ was called an
elementary module of $\mathfrak{a}$ in [12], and denoted by $\mathfrak{H}_\mu$. $H_\mu$ and $\mathfrak{H}_\mu$ will also be said
to be of type $(\kappa_\mu, \lambda_\mu)$, that is, of the same type as the corresponding composition
factor group $\mathfrak{a}_\mu/\mathfrak{a}_{\mu+1}$. The number of composition factor groups in (6) which
are of type $(\kappa, \lambda)$ is denoted by $c_{\kappa\lambda}$. These $c_{\kappa\lambda}$ were called the Cartan invariants
in [5], [11].

Elements $\beta_\mu$ chosen as in 1.6 we shall call primitive elements of $\mathfrak{a}$. We observe
that the system $\beta_1, \cdots, \beta_r$ is chosen with regard to a particular splitting (1)
and a particular composition series (6).

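1.7 Let $\beta_1, \cdots, \beta_r$ be a system of primitive elements of $\mathfrak{a}$. Then $\bar{\mathfrak{a}} = K\beta_1 +
K\beta_2 + \cdots + K\beta_r$ is an algebra over $K$. Any other system $\gamma_1, \cdots, \gamma_r$ of primitive
elements of $\mathfrak{a}$ yields an algebra which is isomorphic to $\bar{\mathfrak{a}}$.

PROOF: $\bar{\mathfrak{a}}$ is evidently closed under addition; it remains to be shown that it is
closed under multiplication.

Let $\beta_{i_1}, \cdots, \beta_{i_m}$ denote the $m = c_{\rho\sigma}$ elements $\beta_\mu$ of type $(\rho, \sigma)$. Then every
element $\alpha$ of $\mathfrak{a}$ such that $e_{\rho,11}\, \alpha\, e_{\sigma,11} = \alpha$ is contained in $K\beta_{i_1} + \cdots + K\beta_{i_m}$,
as one sees from (8). Now let $\beta_\mu$, $\beta_\nu$ be such that $\kappa_\mu = \rho$, $\lambda_\nu = \sigma$. Then $\beta_\mu \beta_\nu =
e_{\rho,11}\, \beta_\mu \beta_\nu\, e_{\sigma,11}$, and so is contained in $K\beta_{i_1} + \cdots + K\beta_{i_m}$, and hence in $\bar{\mathfrak{a}}$. This
shows that $\bar{\mathfrak{a}}$ is also closed under multiplication and so is a subalgebra of $\mathfrak{a}$.

Let $\gamma_1, \cdots, \gamma_r$ be a second system of primitive elements chosen with respect
to the composition series (6); then $\gamma_1, \cdots, \gamma_r$ are linearly independent. Further,
if $\gamma_\nu$ is of type $(\rho, \sigma)$, then $\gamma_\nu = e_{\rho,11}\, \gamma_\nu\, e_{\sigma,11}$, and by the above paragraph is
contained in $K\beta_{i_1} + \cdots + K\beta_{i_m}$. It follows that $\bar{\mathfrak{a}} = K\beta_1 + \cdots + K\beta_r =
K\gamma_1 + \cdots + K\gamma_r$. This shows that any two systems of primitive elements
chosen with respect to the same composition series (6) and same splitting (1)
yield the same algebra $\bar{\mathfrak{a}}$.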




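If now we have another composition series

(6') $\mathfrak{a} = \mathfrak{b}_1 \supset \mathfrak{b}_2 \supset \cdots \supset \mathfrak{b}_r \supset (0)$


different from (6), a theorem of Brauer's shows that we may select complete
residue systems $P_\mu$ for the $\mathfrak{a}_\mu/\mathfrak{a}_{\mu+1}$ and $Q_\nu$ for the $\mathfrak{b}_\nu/\mathfrak{b}_{\nu+1}$ such that each $P_\mu$ is the
same as some $Q_\nu$.⁸ Let $\xi \in P_\mu = Q_\nu$; then for some $p, q$, $e_{\kappa_\mu,pp}\, \xi\, e_{\lambda_\mu,qq} \neq 0$, and we
may take $\beta_\mu = \gamma_\nu = e_{\kappa_\mu,1p}\, \xi\, e_{\lambda_\mu,q1}$. Then the system $\beta_1, \cdots, \beta_r$ of primitive
elements with regard to (6) and the splitting (1) is the same as the system of
primitive elements $\gamma_1, \cdots, \gamma_r$ with regard to (6') and (1), and each is a basis
for the algebra $\bar{\mathfrak{a}}$.

Finally, if we had another splitting


(9) $\mathfrak{a} = \mathfrak{a}^{*\prime} + \mathfrak{n}$

different from (1), the matrix units $e_{\kappa,ij}$ of $\mathfrak{a}^*$ are congruent modulo $\mathfrak{n}$ to matrix
units $e'_{\kappa,ij}$ of $\mathfrak{a}^{*\prime}$. Then if $\beta_1, \cdots, \beta_r$ is a system of primitive elements with
regard to the splitting (1), $\beta'_\mu = e'_{\kappa_\mu,11}\, \beta_\mu\, e'_{\lambda_\mu,11}$, $\mu = 1, 2, \cdots, r$, form a system of
primitive elements with regard to the splitting (9) and the correspondences
$c_1\beta_1 + \cdots + c_r\beta_r \leftrightarrow c_1\beta'_1 + \cdots + c_r\beta'_r$, $c_\mu \in K$, show that the basic algebras
$K\beta_1 + \cdots + K\beta_r$ and $K\beta'_1 + \cdots + K\beta'_r$ are isomorphic.

We shall call the algebra $\bar{\mathfrak{a}}$ determined by any system of primitive elements
of $\mathfrak{a}$, the basic algebra of $\mathfrak{a}$. 1.7 shows that the basic algebra is unique up to
isomorphism.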
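
A small example may help to fix the ideas of 1.6 and 1.7. The following Python sketch is an illustration only; the choice of algebra and all names are assumptions not taken from the text. It takes $\mathfrak{a}$ to be the algebra of $3 \times 3$ matrices whose entries in positions (3,1) and (3,2) vanish, so that $\mathfrak{a}^*$ has simple components of degrees $f_1 = 2$ and $f_2 = 1$ and the radical is spanned by the matrix units in positions (1,3) and (2,3). As primitive elements it takes the matrix units in positions (1,1), (3,3) and (1,3), of types (1,1), (2,2) and (1,2).

```python
def E(i, j, n=3):
    """n x n matrix unit with a single 1 in position (i, j), as nested tuples."""
    return tuple(tuple(1 if (r, c) == (i, j) else 0 for c in range(1, n + 1))
                 for r in range(1, n + 1))

def mul(A, B):
    """Matrix product of two square nested-tuple matrices."""
    n = len(A)
    return tuple(tuple(sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n))
                 for r in range(n))

# Matrix units e_{kappa,ab} of the two simple components of a* (assumed example).
units = {1: {(a, b): E(a, b) for a in (1, 2) for b in (1, 2)},
         2: {(1, 1): E(3, 3)}}
f = {1: 2, 2: 1}

# Primitive elements beta together with their types (kappa, lambda).
primitive = [(E(1, 1), 1, 1), (E(3, 3), 2, 2), (E(1, 3), 1, 2)]

# The elements (7): e_{kappa,a1} beta e_{lambda,1b}.
cartan_basis = []
for beta, kappa, lam in primitive:
    for a in range(1, f[kappa] + 1):
        for b in range(1, f[lam] + 1):
            cartan_basis.append(mul(mul(units[kappa][(a, 1)], beta),
                                    units[lam][(1, b)]))

# In this example every product of primitive elements is zero or again primitive.
zero = mul(E(1, 2), E(1, 2))
span = {beta for beta, _, _ in primitive} | {zero}
closed = all(mul(x, y) in span for x, _, _ in primitive for y, _, _ in primitive)

print(len(set(cartan_basis)), closed)
```

Here the elements (7) are the seven matrix units spanning $\mathfrak{a}$, and the basic algebra is spanned by the three primitive elements, so it is isomorphic to the algebra of triangular $2 \times 2$ matrices.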

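1.8 $\bar{\mathfrak{a}}$ has a unit element.

PROOF: As the beginning of a composition series for $\mathfrak{a}$, we may take $\mathfrak{a} = \mathfrak{c}_1 \supset
\mathfrak{c}_2 \supset \cdots \supset \mathfrak{c}_{k+1} = \mathfrak{n}$ where $\mathfrak{c}_\kappa = \mathfrak{a}_\kappa^* + \cdots + \mathfrak{a}_k^* + \mathfrak{n}$. Then $\mathfrak{c}_\kappa/\mathfrak{c}_{\kappa+1} \cong \mathfrak{a}_\kappa^*$, and we may
choose $e_{\kappa,11}$ as the primitive element corresponding to this factor group. Then
$\sum_{\kappa=1}^{k} e_{\kappa,11}$ is the unit element of $\bar{\mathfrak{a}}$.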


2. Connections between representations of $\mathfrak{a}$ and $\bar{\mathfrak{a}}$


In the following we shall always assume that our representations are such that 
the unit element of the algebra represented is an identity operator on the cor- 
responding representation spaces. This excludes the possibility of representa- 
tions with 0-constituents. Also, as it is customary and convenient, we shall 
consider equivalent representations to be identical. 

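2.1 There is a (1-1) correspondence between the representations of $\mathfrak{a}$ and those
of $\bar{\mathfrak{a}}$. If $\mathfrak{A}$ is a representation of $\mathfrak{a}$, and $\bar{\mathfrak{A}}$ is the corresponding representation of $\bar{\mathfrak{a}}$,
then $\mathfrak{A}$ and $\bar{\mathfrak{A}}$ have corresponding structures.


PROOF: Let $\mathfrak{A}$ be a representation of $\mathfrak{a}$, and let $V$ be the representation space
of $\mathfrak{A}$. Let


(10) $V = V_1 \supset V_2 \supset \cdots \supset V_t \supset V_{t+1} = (0)$


be a composition series for $V$ considered as an $\mathfrak{a}$-left space, and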


8 [3], Th. (1.2A). 




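let $\langle X \rangle \neq \langle 0 \rangle$ be an element of $V_h/V_{h+1}$. Then for some $\nu$ and $a$,
$\langle e_{\nu,aa} X \rangle \neq \langle 0 \rangle$, hence since $e_{\nu,aa} = e_{\nu,a1} e_{\nu,1a}$, $\langle X_h \rangle = \langle e_{\nu,1a} X \rangle \neq \langle 0 \rangle$. It follows
that

(11) $\langle e_{\nu,a1} X_h \rangle, \quad a = 1, 2, \cdots, f_\nu$

is a K-basis of $V_h/V_{h+1}$. Repeating the process for each of the factor groups
$V_h/V_{h+1}$ we obtain a K-basis of $V$ consisting of vectors

(12) $X_{ha} = e_{\nu_h,a1} X_h, \quad h = 1, 2, \cdots, t;\ a = 1, 2, \cdots, f_{\nu_h}.$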


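Now let $\bar{\mathfrak{a}} = K\beta_1 + \cdots + K\beta_r$, and recall that for each primitive element
$\beta_\mu$ there is an index pair $(\kappa, \lambda)$ such that $\beta_\mu = e_{\kappa,11}\, \beta_\mu\, e_{\lambda,11}$. Then, an easy
calculation shows that


(13) $X_h = X_{h1}, \quad h = 1, 2, \cdots, t$


is the K-basis of a vector space $\bar{V}$ which has $\bar{\mathfrak{a}}$ as left operator system. Then $\bar{V}$
is the representation space of a representation $\bar{\mathfrak{A}}$ of $\bar{\mathfrak{a}}$ corresponding to the
representation $\mathfrak{A}$ of $\mathfrak{a}$. We observe that if $\bar{V}_h = KX_h + KX_{h+1} + \cdots + KX_t$ then


(14) $\bar{V} = \bar{V}_1 \supset \bar{V}_2 \supset \cdots \supset \bar{V}_t \supset (0)$


is a composition series for the $\bar{\mathfrak{a}}$-space $\bar{V}$.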

Conversely, if we have a representation $\bar{\mathfrak{A}}$ of $\bar{\mathfrak{a}}$, we may assume that a basis
adapted to a composition series of the representation space $\bar{V}$ of $\bar{\mathfrak{A}}$ has been
chosen in the form (13) such that for each $X_h$ there is some $\nu$ for which $X_h =
e_{\nu,11} X_h$ $(h = 1, 2, \cdots, t)$. We form a basis for a representation space $V$ of $\mathfrak{a}$
by adding to the basis of $\bar{V}$, vectors

(15) $e_{\nu_h,a1} X_h, \quad a = 2, \cdots, f_{\nu_h};\ h = 1, 2, \cdots, t$, where $X_h = e_{\nu_h,11} X_h$,

and requiring the associativity condition that for $\alpha, \beta \in \mathfrak{a}$


(16) $\alpha(\beta X_h) = (\alpha\beta) X_h$.

We observe that this is just the inverse of the process of passing from $\mathfrak{a}$-spaces
to $\bar{\mathfrak{a}}$-spaces.


In the above method of obtaining an $\bar{\mathfrak{a}}$-space $\bar{V}$ from the $\mathfrak{a}$-space $V$ we used a
particular composition series of $V$, and a particular choice of residues $X_h$. The
methods used in 1.6 adapted to the case where $\mathfrak{a}$ is a left operator system only,
show that $\bar{V}$ is independent of the choice of composition series for $V$ or of the
residues $X_h$. Similar remarks hold for the inverse process of passing from an
$\bar{\mathfrak{a}}$-space $\bar{V}$ to an $\mathfrak{a}$-space $V$. It follows that there is a (1-1) correspondence
between the $\mathfrak{a}$-spaces and the $\bar{\mathfrak{a}}$-spaces, and consequently a (1-1) correspondence
between representations of $\mathfrak{a}$ and those of $\bar{\mathfrak{a}}$.

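An examination of the process of passing from the $\mathfrak{a}$-space $V$ to the corresponding
$\bar{\mathfrak{a}}$-space $\bar{V}$, and of the inverse process, shows easily that the representations
$\mathfrak{A}$ and $\bar{\mathfrak{A}}$ have corresponding irreducible constituents, corresponding
indecomposable constituents, corresponding Loewy constituents,⁹ etc. In fact,
if we calculate $\mathfrak{A}$ by the basis (12) of $V$, we have the following scheme:


(17) $\alpha(X_{11}, \cdots, X_{t f_{\nu_t}}) = (X_{11}, \cdots, X_{t f_{\nu_t}}) \begin{pmatrix} A_{11}(\alpha) & & & \\ A_{21}(\alpha) & A_{22}(\alpha) & & \\ \cdots & \cdots & \cdots & \\ A_{t1}(\alpha) & A_{t2}(\alpha) & \cdots & A_{tt}(\alpha) \end{pmatrix}$

while for $\bar{\mathfrak{A}}$ we have, using (13),


(18) $\bar{\alpha}(X_1, \cdots, X_t) = (X_1, \cdots, X_t) \begin{pmatrix} \bar{A}_{11}(\bar{\alpha}) & & & \\ \bar{A}_{21}(\bar{\alpha}) & \bar{A}_{22}(\bar{\alpha}) & & \\ \cdots & \cdots & \cdots & \\ \bar{A}_{t1}(\bar{\alpha}) & \bar{A}_{t2}(\bar{\alpha}) & \cdots & \bar{A}_{tt}(\bar{\alpha}) \end{pmatrix}$

If we calculate the matrix in $\mathfrak{A}$ which corresponds to $\bar{\alpha}$ we find that in each $A_{ij}(\bar{\alpha})$,
the only coefficient different from zero is that in the upper left corner, and
recalling that $X_{h1} = X_h$ we see that this coefficient is $\bar{A}_{ij}(\bar{\alpha})$. Thus, to obtain
$\bar{\mathfrak{A}}$ from $\mathfrak{A}$ we may take the matrices of $\mathfrak{A}$ corresponding to the elements $\bar{\alpha}$ and
'deflate' them by striking out all rows and columns except those passing through
the upper left corners of the simple parts $\mathfrak{A}_{ij}$. The simple parts $\mathfrak{A}_{ij}$ ($\bar{\mathfrak{A}}_{ij}$) are linear
combinations of the elementary modules $\mathfrak{H}_\mu$ ($\bar{\mathfrak{H}}_\mu$) and if $\mathfrak{A}_{ij} = \sum_\mu d^{\mu}_{ij} \mathfrak{H}_\mu$,
where the $d^{\mu}_{ij}$ are fixed elements of $K$, then $\bar{\mathfrak{A}}_{ij} = \sum_\mu d^{\mu}_{ij} \bar{\mathfrak{H}}_\mu$.¹⁰ This completes
the discussion of 2.1.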
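
The 'deflation' just described is a purely mechanical operation on a block matrix. The following Python sketch is an illustration only; the function name `deflate`, the list of block sizes and the sample matrix are hypothetical. It keeps exactly the rows and columns which pass through the upper left corner of each diagonal block.

```python
def deflate(matrix, block_sizes):
    """Strike out all rows and columns except those through the upper left
    corner of each diagonal block; block_sizes lists the block dimensions."""
    corners = []
    start = 0
    for size in block_sizes:
        corners.append(start)   # index of the first row and column of the block
        start += size
    return [[matrix[r][c] for c in corners] for r in corners]

# Diagonal blocks of sizes 2 and 1; in each block only the upper left
# coefficient is different from zero, as for the matrices of the elements
# of the basic algebra.
A = [[5, 0, 0],
     [0, 0, 0],
     [7, 0, 3]]
print(deflate(A, [2, 1]))
```

The deflated matrix has one row and one column for each composition factor, in agreement with the basis (13).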


3. Classes of algebras 


An equivalence relation among algebras over $K$ may now be set up as follows.
We say that $\mathfrak{a}$ is similar to $\mathfrak{b}$ (and write $\mathfrak{a} \sim \mathfrak{b}$), if the basic algebra $\bar{\mathfrak{a}}$ of $\mathfrak{a}$ is
isomorphic to the basic algebra $\bar{\mathfrak{b}}$ of $\mathfrak{b}$. It follows at once that the algebras over
$K$ are classified by this relation into disjoint classes of equivalent algebras.

2.1 shows that all algebras belonging to a particular class have corresponding
representations. In fact, if $\mathfrak{a} \sim \mathfrak{b}$, we shall have correspondences of the following
form: $\mathfrak{a}$-space $V$ $\leftrightarrow$ $\bar{\mathfrak{a}}$-space $\bar{V}$ $\leftrightarrow$ $\bar{\mathfrak{b}}$-space $\bar{W}$ $\leftrightarrow$ $\mathfrak{b}$-space $W$. The representations
$\mathfrak{A}$ and $\mathfrak{B}$ obtained from $V$, $W$ respectively, may be written with corresponding
simple parts $A_{ij}$, $B_{ij}$, but these may differ in their dimensions (see,
for example, $\mathfrak{A}$, $\bar{\mathfrak{A}}$ in (17), (18)).


9 For a discussion of Loewy constituents, see [3], §5.
10 This may be seen by computation of $\mathfrak{A}$ and $\bar{\mathfrak{A}}$ by means of (8), (17), (18). See also
[12], Th. 4.




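The following theorem indicates that we may define multiplication of classes
by taking as the product of two classes, the class containing the direct product
of any pair of representatives of the two classes:

3.1 If $\mathfrak{a} = \mathfrak{b} \times \mathfrak{c}$ then $\bar{\mathfrak{a}} = \bar{\mathfrak{b}} \times \bar{\mathfrak{c}}$.

PROOF: Let $\beta_1, \beta_2, \cdots, \beta_p$ be a system of primitive elements for $\mathfrak{b}$, and
$\gamma_1, \gamma_2, \cdots, \gamma_q$ a system of primitive elements for $\mathfrak{c}$. Then one may verify that
$\beta_i \gamma_j$ $(i = 1, 2, \cdots, p$; $j = 1, 2, \cdots, q)$ is a system of primitive elements for
$\mathfrak{b} \times \mathfrak{c}$. The algebra spanned by the elements $\beta_i \gamma_j$ is $\bar{\mathfrak{b}} \times \bar{\mathfrak{c}}$.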


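4. Commutator algebras¹²

Let $V$ be an $\mathfrak{a}$-left space, and let

(19) $V = V_{11} + \cdots + V_{1 s_1} + \cdots + V_{g1} + \cdots + V_{g s_g}$

denote the decomposition of $V$ into a direct sum of indecomposable $\mathfrak{a}$-spaces
where $V_{\rho 1} \cong V_{\rho 2} \cong \cdots \cong V_{\rho s_\rho}$. Let $\mathfrak{a}'$ denote the algebra of $\mathfrak{a}$-homomorphisms
of $V$; $\mathfrak{a}'$ is again an algebra with unit element over $K$. We consider $\mathfrak{a}'$ also as a
left operator system of $V$.¹¹ If $\alpha \in \mathfrak{a}$, $\alpha' \in \mathfrak{a}'$, $X \in V$ then


(20) $\alpha'(\alpha X) = \alpha(\alpha' X)$.


In terms of representations we have that if $V$ as an $\mathfrak{a}$-space produces the
representation $\mathfrak{A}$ of $\mathfrak{a}$, and as an $\mathfrak{a}'$-space produces the representation $\mathfrak{A}'$ of $\mathfrak{a}'$, then
$\mathfrak{A}'$ is the commutator algebra of $\mathfrak{A}$. Here $\mathfrak{A}$ need not be a faithful representation
of $\mathfrak{a}$, but $\mathfrak{A}'$ is faithful for $\mathfrak{a}'$. It is to be noted that $\mathfrak{a}'$ depends on the space $V$,
and, as we shall see below, its structure may vary considerably with different
choices of $V$.

The main theorem on homomorphism algebras (or what is equivalent,
commutator algebras) is

4.1 The $\mathfrak{a}$-homomorphism algebra $\mathfrak{a}'$ of an $\mathfrak{a}$-space $V$ of form (19) may be written
as the direct sum


(21) $\mathfrak{a}' = \mathfrak{a}'^{*}_1 + \cdots + \mathfrak{a}'^{*}_g + \mathfrak{n}'$


where $\mathfrak{a}'^{*}_\rho$ denotes a simple algebra of $\mathfrak{a}$-homomorphisms of $V_\rho = V_{\rho 1} + \cdots + V_{\rho s_\rho}$,
and $\mathfrak{n}'$ is the radical of $\mathfrak{a}'$.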

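PROOF: Denote by $e'_{\rho,p1}$ an element of $\mathfrak{a}'$ which maps $V_{\rho 1}$ onto $V_{\rho p}$
isomorphically, and maps $V_{\tau a}$, where the pair $(\tau, a) \neq$ the pair $(\rho, 1)$, onto zero. In
particular, let $e'_{\rho,11}$ $(\rho = 1, 2, \cdots, g)$ effect the identity mapping of $V_{\rho 1}$ onto
itself. Let $e'_{\rho,1p}$ be an element of $\mathfrak{a}'$ which maps $V_{\tau a}$ onto zero for $(\tau, a) \neq (\rho, p)$,
and $V_{\rho p}$ onto $V_{\rho 1}$ in such a way that $e'_{\rho,1p}\, e'_{\rho,p1} = e'_{\rho,11}$. Then $e'_{\rho,pq} = e'_{\rho,p1}\, e'_{\rho,1q}$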

11 In place of this, E. Artin strongly favors using the inverse algebra of $\mathfrak{a}'$ as a right
operator system, in which case the commutativity relation (20) is replaced by an
associativity relation (see (5)). We choose, however, in this section to take $\mathfrak{a}'$ as left operator
system, as this seems a little more directly related to the idea of commutator algebra of a
representation.

12 The material of this section follows rather directly from ring theory established by
other writers. See, for example, Brauer [2], lectures on the theory of rings by E. Artin
(forthcoming), and the reference in footnote 2.




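maps $V_{\rho q}$ onto $V_{\rho p}$, and maps $V_{\tau a}$ onto zero, $(\tau, a) \neq (\rho, q)$. Further, $e'_{\rho,pq}\, e'_{\sigma,uv} =
\delta_{\rho\sigma} \delta_{qu}\, e'_{\rho,pv}$ ($\delta_{ab}$ is the identity mapping for $a = b$, and the 0-mapping for $a \neq b$).
We observe that each $e'_{\rho,pp}$ is a primitive idempotent of $\mathfrak{a}'$. For, if $e'_{\rho,pp} =
\epsilon_1 + \epsilon_2$ where $\epsilon_1, \epsilon_2$ are mutually orthogonal idempotents $\neq 0$ of $\mathfrak{a}'$, then
$V_{\rho p} = e'_{\rho,pp} V_{\rho p} = \epsilon_1 V_{\rho p} + \epsilon_2 V_{\rho p}$ would be the direct sum of the $\mathfrak{a}$-spaces
$\epsilon_1 V_{\rho p}$, $\epsilon_2 V_{\rho p}$, contrary to the assumption that $V_{\rho p}$ is indecomposable. We next
observe that $\mathfrak{a}'^{*}_\rho = \sum_{p,q=1}^{s_\rho} K e'_{\rho,pq}$ is a simple algebra of degree $s_\rho$ over $K$. Also
$e' = \sum_{\rho=1}^{g} \sum_{p=1}^{s_\rho} e'_{\rho,pp}$ is a decomposition of the unit element $e'$ of $\mathfrak{a}'$ into a sum of
mutually orthogonal primitive idempotents. It follows that $\mathfrak{a}'^{*}_1 + \cdots + \mathfrak{a}'^{*}_g$
is a semisimple subalgebra of $\mathfrak{a}'$ which is isomorphic to $\mathfrak{a}'/\mathfrak{n}'$ and (21) results.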


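4.2 Let $\mathfrak{a}' = \mathfrak{a}'_1 \supset \mathfrak{a}'_2 \supset \cdots \supset \mathfrak{a}'_{r'} \supset (0)$ denote a composition series of $\mathfrak{a}'$ considered
as an $(\mathfrak{a}', \mathfrak{a}')$ space, and let $\mathfrak{a}'_\nu/\mathfrak{a}'_{\nu+1}$ be a composition factor group of type $(\rho, \sigma)$,
$\rho, \sigma = 1, 2, \cdots,$ or $g$. Then there is an element $\gamma_\nu$ in $\mathfrak{a}'_\nu$ such that (1) $\gamma_\nu$ maps
$V_{\sigma 1}$ into $V_{\rho 1}$, maps all other $V_{\tau a}$ into 0, (2) $\mathfrak{a}'_\nu/\mathfrak{a}'_{\nu+1} = \mathfrak{a}'^{*}_\rho \langle \gamma_\nu \rangle \mathfrak{a}'^{*}_\sigma$.

PROOF: This is an immediate consequence of 1.4 applied to the irreducible
$(\mathfrak{a}', \mathfrak{a}')$-space $\mathfrak{a}'_\nu/\mathfrak{a}'_{\nu+1}$. We have from 1.4 that $\gamma_\nu \in \mathfrak{a}'_\nu$ exists such that $e'_{\rho,11}\, \gamma_\nu\, e'_{\sigma,11} =
\gamma_\nu$ and that (2) holds. But the homomorphism $e'_{\rho,11}\, \gamma_\nu\, e'_{\sigma,11}$ maps $V_{\sigma 1}$ into $V_{\rho 1}$,
and $V_{\tau a}$ onto zero if $(\tau, a) \neq (\sigma, 1)$, so that (1) also is satisfied.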

By applying 1.6 we may obtain a Cartan basis for $\mathfrak{a}'$, and by means of its
elements define a system of elementary modules of $\mathfrak{a}'$.

We call the $\mathfrak{a}$-space $\tilde{V}$ formed by taking one $V_{\rho a}$ from each set of isomorphic
indecomposable subspaces of $V$, the reduced space of $V$. We may assume


$\tilde{V} = V_{11} + V_{21} + \cdots + V_{g1}.$


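4.3 The basic algebra of the $\mathfrak{a}$-homomorphism algebra $\mathfrak{a}'$ of the $\mathfrak{a}$-space $V$ is the
$\mathfrak{a}$-homomorphism algebra of the reduced space $\tilde{V}$ of $V$.

PROOF: The elements $\gamma_\nu$, $\nu = 1, 2, \cdots, r'$ (cf. 4.2) generate the
basic algebra $\bar{\mathfrak{a}}'$ of $\mathfrak{a}'$. It follows from (1) in 4.2 that each element $\gamma_\nu$ gives an
$\mathfrak{a}$-homomorphism of $\tilde{V}$. We have only to verify that every $\mathfrak{a}$-homomorphism
of $\tilde{V}$ is linearly dependent on the $\gamma_\nu$. $\tilde{e}' = \sum_{\rho=1}^{g} e'_{\rho,11}$ is the identity
homomorphism of $\tilde{V}$. Then any homomorphism $\theta$ of $\tilde{V}$ satisfies $\theta = \tilde{e}' \theta \tilde{e}' =
\sum_{\rho,\sigma} e'_{\rho,11}\, \theta\, e'_{\sigma,11}$. $\theta$ may be extended to a homomorphism of $V$ by setting
$\theta \cdot V_{\rho a} = (0)$ for $a \neq 1$. As a homomorphism of $V$, $\theta = \sum_{\rho,\sigma} e'_{\rho,11}\, \theta\, e'_{\sigma,11}$ is
linearly dependent on the primitive elements $\gamma_\nu$.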

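Theorem 4.1 provides some information concerning the irreducible constituents
of the commutator algebra $\mathfrak{A}'$. Since $\mathfrak{A}'$ is a faithful representation of a',
there appears in $\mathfrak{A}'$, corresponding to the simple algebra $a'_\rho$ in (21), an irreducible
constituent $\mathfrak{F}'_\rho$ of degree $s_\rho$ ($\rho$ = 1, 2, ..., g). It has been shown elsewhere that
the multiplicity of $\mathfrak{F}'_\rho$ equals the degree of $\mathfrak{U}_\rho$. Thus to each distinct
indecomposable part $\mathfrak{U}_\rho$ of $\mathfrak{A}$ there corresponds an irreducible part $\mathfrak{F}'_\rho$ of $\mathfrak{A}'$ such
that the degree of $\mathfrak{F}'_\rho$ is the multiplicity of $\mathfrak{U}_\rho$ in $\mathfrak{A}$, and the multiplicity of $\mathfrak{F}'_\rho$ in $\mathfrak{A}'$
is the degree of $\mathfrak{U}_\rho$.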
For a simple, and hence also for a semisimple, the commutator algebras of 
faithful representations of a are all similar in the sense of §3. However, this 




may not be true if a has a radical, as the following example shows. Let a be 


ty ay 1 O O 
an algebra over K having A = bo 4 and 8 = ( 1 De | as faithful 
0 0 


representations, where ; , 52 , Ha: denote elementary modules of 1 X 1 matrices. 


and for is 6” = 


The commutator algebra of Y is of form € = ( 
0 M 


O M 
matrices, and It, ~ Mt,. Here € is not similar to C’. 

We may obtain, however, the following theorem which is almost obvious 
from the results derived in §§2, 3. 

4.4. If a is similar to b, and the a-space V corresponds to the b-space W, then the
a-homomorphism algebra a' of V is similar to the b-homomorphism algebra b' of W.

Proof: It is sufficient to show that the theorem holds for the a-space V and
the corresponding a-space V. We consider the reduced spaces V, V (cf.
4.3). Let


+9, 


be decompositions of V, V into direct sums of distinct indecomposable sub- 
spaces. Let e’ be the identity homomorphism of V, and e, the a-homomorphism 
which maps V, identically on itself, and V, on (0), A ¥ p. Then for any a’ €a’, 
a’ = e'a'e’ = , Where = maps V, into V,, and V, on (0), 
We take composition series 


; 


and apply the method of §2 to obtain bases of the form (12) for V, and V, 
namely 


O 
(° Mr ) where the Yt; are elementary modules consisting of 1 xX 1 


h = 1,2,---,te 
= Cn, (8 a) 
a = 1,2,--- 


Here the xi = én Xn ,h = 1, 2, --- , te form a basis for V,. If now, under 
de, X, Yee V,, then 


(22) Cy, h ,al Yi, 
in particular, 
Yin 


which shows that Y«V,. Let us denote the a-mapping of which takes 


into (h = 1, 2,-++, te) by Then & = > is an G-mapping of V 


determined by the a-mapping a’ of V. Inversely, we may start with an 4- 
mapping of V, and use the relations (22) to obtain an a-mapping of V. It is then 




easy to verify that the a-homomorphism algebra of V is isomorphic to the a- 


homomorphism algebra of v, and from 4.3 it then follows that the a-homo- 
morphism algebra of V is similar to the a-homomorphism algebra of V. 


II. SYMMETRIC FUNCTIONS OF a


5. Preliminaries 


Brauer, Nakayama, and one of the authors have discussed in [5], [11], [9], [10]
a class of algebras which they called symmetric algebras. An algebra a is called
symmetric if in a, considered as a vector space, there exists a hyperplane
which contains all commutator elements $\alpha\beta - \beta\alpha$ of a but does not
contain a one-sided a-ideal other than (0). Here we shall discuss hyperplanes
of a which contain all commutator elements; such a hyperplane we shall
call a symmetric hyperplane. However, we propose a slight change of
language. We shall say that a function $\varphi$ of a is linear symmetric if, in addition
to linearity, $\varphi(\alpha\beta) = \varphi(\beta\alpha)$ for all $\alpha, \beta \in$ a. A symmetric hyperplane yields
a linear symmetric function of a, and conversely. For if the symmetric
hyperplane h consists of all elements $\alpha \in$ a such that $\psi(\alpha) = 0$, $\psi$ a linear
function, then the condition that all commutators $\alpha\beta - \beta\alpha$ appear in h implies
that $\psi(\alpha\beta) = \psi(\beta\alpha)$. In our discussion it is more convenient, generally, to
speak of linear symmetric functions of a rather than of the corresponding symmetric
hyperplanes. For brevity, let us now speak of symmetric functions, it
being understood that these shall be linear.

The characters of the representations of a form an important class of sym- 
metric functions of a. 

We shall have occasion to refer to the regular representations of a which we
now define. Let $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ be a basis for a. For every $\alpha$ in a we have
equations

$\epsilon_k \alpha = \sum_m r_{km}(\alpha)\,\epsilon_m, \qquad \alpha \epsilon_k = \sum_m s_{mk}(\alpha)\,\epsilon_m,$

where the coefficients $r_{km}(\alpha)$, $s_{mk}(\alpha)$ lie in K. The first regular representation, R,
of a is given by the homomorphism $\alpha \to R(\alpha) = (r_{ij}(\alpha))$ and the second regular
representation, S, of a by $\alpha \to S(\alpha) = (s_{ij}(\alpha))$, where i, j denote row and column
index respectively. It has been shown that an algebra is symmetric if and only
if there exists a symmetric non-singular matrix T which transforms R into S.¹³
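
As a small illustration of the two regular representations, the following sketch computes R and S for the algebra of lower triangular 2 × 2 matrices with basis e11, e21, e22. The choice of algebra, the function names, and the convention that R is read off from right multiplication and S from left multiplication are assumptions made for this example only; they are not taken from the paper.

```python
# Basis e11, e21, e22 of the lower triangular 2x2 matrices (an illustrative choice).
BASIS = [((1, 0), (0, 0)), ((0, 0), (1, 0)), ((0, 0), (0, 1))]


def matmul(x, y):
    """Product of two matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0])))
        for i in range(len(x))
    )


def coords(x):
    """Coordinates of a lower triangular 2x2 matrix with respect to BASIS."""
    return (x[0][0], x[1][0], x[1][1])


def first_regular(alpha):
    """R(alpha) = (r_km), read off from e_k * alpha = sum_m r_km * e_m (assumed convention)."""
    return tuple(coords(matmul(e, alpha)) for e in BASIS)


def second_regular(alpha):
    """S(alpha) = (s_mk), read off from alpha * e_k = sum_m s_mk * e_m (assumed convention)."""
    columns = [coords(matmul(alpha, e)) for e in BASIS]
    return tuple(tuple(col[m] for col in columns) for m in range(len(BASIS)))


alpha = ((2, 0), (3, 5))
beta = ((7, 0), (-1, 4))
# Both maps should respect products: R(ab) = R(a)R(b) and S(ab) = S(a)S(b).
print(first_regular(matmul(alpha, beta)) == matmul(first_regular(alpha), first_regular(beta)))
print(second_regular(matmul(alpha, beta)) == matmul(second_regular(alpha), second_regular(beta)))
```

With these index conventions both maps are multiplicative, which is what makes them representations; a different placement of row and column indices would give an anti-homomorphism instead.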


6. Symmetric functions of a and of $\bar{a}$


Here we shall prove 

6.1. There is a (1-1) correspondence between the symmetric functions of a
and those of the basic algebra $\bar{a}$ of a.

Proof: Let $\varphi$ be a symmetric function of a; then $\varphi$ maps all commutators


13 [11], p. 654.


$\alpha\beta - \beta\alpha$ of a on zero. We use here as basis of a the Cartan basis system (7).
A commutator determined by two of these basis elements is of form


We distinguish three cases: 


1) Neither of the pairs { is satisfied. Then y = 0. 


2) One of the pairs, say the first, is satisfied, and the other is not. Then 
= Cx, ,a18uBv€r, 
3) Both pairs are satisfied. Then 


If in case 2) we take @,, to be the element we obtain y = é,, 
if a dor ku hence 


As a special case of 3) we take 6, to be of unmixed type (ku, ku)" and ég, c1Bv€r,,14 
to be , and obtain y = — Thus for By of unmixed 
type 
(24) = (Ex, b1 Bux, 

Let us denote by cu». the multiplication constants of the elements Bu(u = 
1, 2,---, r), that is, 


(25) BuBo = > Cuvwhw + 


w=1 


We shall now define a function ¢ of a which, in addition to being linear, satisfies 
the relations ‘ 


én = 0 if of mixed type“ 
wlny,1a) if 8. of unmixed type, 

the latter relation according to (24) being independent of a. Then since ¢(y) = 

0, for y as in case 3) we obtain successively: 


(Ex, vex, (Ex, vB ulx, 1b), 


?(BuB») ?(Br8u). 


14 A quantity of type $(\kappa, \lambda)$ is of mixed type if $\kappa \neq \lambda$, and is of unmixed type if $\kappa = \lambda$.




Here $\beta_u\beta_v$ and $\beta_v\beta_u$ are of unmixed types $(\kappa_u, \kappa_u)$, $(\kappa_v, \kappa_v)$ respectively. If,
however, $\beta_u\beta_v \neq 0$ is of mixed type, then $\beta_v\beta_u = 0$, and the first relation in (26)
shows that in this case also $\bar\varphi(\beta_u\beta_v) = \bar\varphi(\beta_v\beta_u)$. This shows that the symmetric
function $\varphi$ of a determines a symmetric function $\bar\varphi$ of $\bar{a}$. The inverse process is
now evident, and the proof of 6.1 is completed.

Let us call the function $\psi_w$ of a defined by

$\psi_w(\alpha) = \operatorname{trace} H_w(\alpha),$

where $\mathfrak{H}_w$ is an elementary module of unmixed type, the character of $\mathfrak{H}_w$. Then
we may show

6.2. Every symmetric function of a is expressible as a linear combination of the
characters of the elementary modules of unmixed type.

Proof: If $\varphi$ is a symmetric function of a, then for $\alpha \in$ a


g(a) hav(a)éx,,,a1 Bu €n,,,10) 


and by (23) this gives 
g(a) = Bw Cxy,1a) 


where now the range of summation for w is determined by the elements $\beta_w$ of
unmixed type. It follows from (24) and (26) that

(27) $\varphi(\alpha) = \sum_w \bar\varphi(\beta_w)\,\psi_w(\alpha),$

and 6.2 is proved.
If $\bar\varphi$ is a symmetric function of $\bar{a}$, then the symmetric function of a which,
according to 6.1, is determined by $\bar\varphi$ is $\sum_w \bar\varphi(\beta_w)\,\psi_w$.


7. Center elements and symmetric functions 


From 6.2 it follows that the number q of linearly independent symmetric
functions of a is at most equal to the number $s = \sum_{\kappa=1}^{k} c_{\kappa\kappa}$ of elementary modules
of unmixed type. If q = s we say that the algebra a is completely permutative.
For a completely permutative algebra a the character of each elementary module
of unmixed type is a symmetric function of a.

Let us denote by p the rank of the center of a. Since for an element $\zeta$ of the
center $H_u(\zeta) = 0$ for $\mathfrak{H}_u$ of mixed type, and $H_w(\zeta) = c_w(\zeta)(\delta_{ij})$, where $c_w(\zeta) \in K$, for $\mathfrak{H}_w$
of unmixed type, we have $p \le s$.¹⁵ For a completely permutative algebra $p \le
q = s$. The strict inequality p < q may hold. For example, take


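$\mathfrak{A} = \begin{pmatrix} \mathfrak{H}_1 & 0 \\ \mathfrak{H}_3 & \mathfrak{H}_2 \end{pmatrix},$


where the $\mathfrak{H}_i$ denote elementary modules, such that $\mathfrak{H}_1$, $\mathfrak{H}_2$ are distinct irreducible
constituents of $\mathfrak{A}$, and $\mathfrak{H}_3$ belongs to the radical. Then the characters
of $\mathfrak{H}_1$, $\mathfrak{H}_2$ are symmetric functions of a, and q = s = 2, while p = 1.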
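
The values q = 2 and p = 1 can be checked by direct linear algebra in the simplest instance of such an example. The sketch below assumes that every elementary module consists of 1 × 1 matrices, so that the algebra is just the lower triangular 2 × 2 matrices over the rationals; this specialization, the coordinate convention, and all function names are assumptions of the illustration. It uses only the fact that symmetric functions are the linear functions vanishing on all commutators.

```python
from fractions import Fraction

# Coordinates (x, y, z) stand for the lower triangular matrix [[x, 0], [y, z]].
BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def multiply(u, v):
    """Product of two lower triangular 2x2 matrices given by coordinates."""
    return (u[0] * v[0], u[1] * v[0] + u[2] * v[1], u[2] * v[2])


def commutator(u, v):
    """Coordinates of uv - vu."""
    uv, vu = multiply(u, v), multiply(v, u)
    return tuple(a - b for a, b in zip(uv, vu))


def rank(rows):
    """Rank of a list of rational row vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def symmetric_and_center_ranks():
    """Return (q, p): number of independent symmetric functions, rank of the center."""
    n = len(BASIS)
    brackets = [commutator(u, v) for u in BASIS for v in BASIS]
    # Symmetric functions are the linear functions vanishing on all commutators.
    q = n - rank(brackets)
    # zeta = sum_j z_j e_j is central iff sum_j z_j [e_j, e_k] = 0 for every k.
    conditions = [
        [commutator(BASIS[j], BASIS[k])[m] for j in range(n)]
        for k in range(n)
        for m in range(n)
    ]
    p = n - rank(conditions)
    return q, p


print(symmetric_and_center_ranks())
```

Here the commutators span only the multiples of the off-diagonal basis element, so two independent symmetric functions survive (the two diagonal entries), while the center consists of the scalar matrices alone.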


15 [12], Th. 9. 




One may, on the other hand, construct cases where q is less than p. Consider


D1 

Ds De 
Ds Da D1 
0 G 


where again the $\mathfrak{H}_i$ denote distinct elementary modules. Here s = 4. Three
linearly independent elements $\alpha$, $\beta$, $\gamma$ of the center are obtained by taking:


for $\alpha$, $H_1(\alpha)$, $H_2(\alpha)$ to be unit matrices, and $H_i(\alpha) = (0)$, $i \neq 1, 2$;
for $\beta$, $H_5(\beta)$ to be a unit matrix, and $H_i(\beta) = (0)$, $i \neq 5$;
for $\gamma$, $H_6(\gamma)$ to be a unit matrix, and $H_i(\gamma) = (0)$, $i \neq 6$.


Any symmetric function of a is a linear combination of the characters of $\mathfrak{H}_1$,
$\mathfrak{H}_2$, $\mathfrak{H}_5$, and $\mathfrak{H}_6$. The characters of $\mathfrak{H}_5$ and $\mathfrak{H}_6$ are unsymmetric, and it follows
that q = 2.

Nakayama¹⁶ has shown that for a symmetric algebra there exists a (1-1)
correspondence between symmetric hyperplanes and elements of the center, and
hence p = q. In fact, let $T = (t_{ij}) = T'$ be a symmetric matrix which transforms
the first regular representation, R, into the second regular representation, S,
that is, ST = TR, $|T| \neq 0$. Let $\epsilon_1, \ldots, \epsilon_n$ again denote a basis of a, and set
= tije;. If gisa symmetric function of a, then > 21 o(é)e; is an element 
of the center of a. Conversely, if >>%; 6(e;)e; is an element of the center of a 
symmetric algebra a, then a symmetric function ¢ of a may be defined by g(€;) = 
6(e;). For the first of these statements it is not necessary that the symmetric 
matrix T be non-singular.


8. The case p = s 

We shall show

8.1. In order that the rank p of the center of a should equal $s = \sum_\kappa c_{\kappa\kappa}$ it is
necessary and sufficient that a) there be no elementary modules of a of mixed type and
b) a be completely permutative, q = s.

Proof: i) An element $\zeta$ of the center is of form $\zeta = \sum_w c_w(\zeta)\,e^{(w)}$, where
$e^{(w)} = \sum_a e_{\kappa_w,a1}\,\beta_w\,e_{\kappa_w,1a}$, $\beta_w$ of unmixed type $(\kappa_w, \kappa_w)$. If the number p of
linearly independent center elements equals s, then each of the s elements $e^{(w)}$
is an element of the center. In particular, $e_\kappa = \sum_a e_{\kappa,aa}$ is an element of the
center. If now there were a basis element $\beta_u$ of mixed type $(\kappa, \lambda)$, we would
have $e_\kappa\beta_u = \beta_u$, while $\beta_u e_\kappa = 0$. Thus, if p = s, there are no elementary modules
of a of mixed type.

It follows easily that the basic algebra $\bar{a}$ is commutative. For if $e^{(w)}$, $e^{(y)}$
are both of type $(\kappa, \kappa)$, we have, since these elements are in the center,


(w) Cy) (y) ,@) 


= 


€x,al Bw By = €x,b1 By Bw €x,1b 


16 Cf, [9]. 




and multiplying both members on left and right by $e_{\kappa,11}$, we obtain

(28) $\beta_w\beta_y = \beta_y\beta_w.$


(28) evidently holds also when $\beta_w$, $\beta_y$ have different types, say $(\kappa, \kappa)$ and $(\lambda, \lambda)$,
and so $\bar{a}$ is commutative.

Since $\bar{a}$ is commutative every linear function of $\bar{a}$ is symmetric. In particular,
we obtain s linearly independent symmetric functions $\bar\varphi_w$ by setting


(29) $\bar\varphi_w(\beta_y) = 0,\ y \neq w; \qquad \bar\varphi_w(\beta_w) = 1 \qquad (w = 1, 2, \ldots, s).$


Then 6.1 shows that there exist s linearly independent symmetric functions of
a, so that a is completely permutative.

ii) If there are no elementary modules of mixed type, then all Cartan basis
elements $\beta_u$ are of unmixed type and the basic algebra $\bar{a}$ is of order s. If further
a is completely permutative, then it follows from 6.1 that there exist s linearly
independent symmetric functions of $\bar{a}$. Hence every linear function of $\bar{a}$ is
symmetric (in particular, the functions (29)), which implies that $\bar{a}$ is commutative.
Then by reversing the argument in i) we find that the rank p of the center
of a is s.

From the above proof it is evident that the theorem might have been stated in
the form: The rank p of the center of a is equal to $s = \sum_\kappa c_{\kappa\kappa}$ if and only if the
basic algebra $\bar{a}$ of a is commutative.


9. Blocks 


In foregoing papers, irreducible representations, characters, Cartan basis
elements and other entities have been classified according to “blocks”.¹⁷ These
blocks correspond to invariant subalgebras which are direct summands of a,
and we shall identify the “blocks” with these summands. Let $a^{(\tau)}$ be such a
block, and let $s_\tau = \sum_\kappa c_{\kappa\kappa}$, where the summation extends over Cartan invariants
associated with $a^{(\tau)}$. Assume now that the rank $p_\tau$ of the center of $a^{(\tau)}$ is equal
to $s_\tau$. By 8.1, $a^{(\tau)}$ has Cartan basis elements of unmixed type only. If $a^{(\tau)}$ contains
elements of type $(\kappa, \kappa)$, then all elements of $a^{(\tau)}$ are of type $(\kappa, \kappa)$, since the
elements of type $(\kappa, \kappa)$ form a direct summand of $a^{(\tau)}$. We have proved

9.1. If the center of a block $a^{(\tau)}$ has rank $s_\tau = \sum_\kappa c_{\kappa\kappa}$, then all elements of $a^{(\tau)}$
are of the same unmixed type. Moreover, $a^{(\tau)}$ is completely permutative.

Let us now take for a the group ring $\Gamma$ of a finite group G formed with respect
to a suitable modular field K.¹⁸ Let $\Gamma^{(\tau)}$ now be a block of $\Gamma$, and let $F_1, \ldots,
F_{k_\tau}$ and $Z_1, \ldots, Z_{z_\tau}$ denote the modular and the ordinary irreducible representations
of G belonging to $\Gamma^{(\tau)}$. It may be shown that the rank $p_\tau$ of the center
of $\Gamma^{(\tau)}$ is equal to $z_\tau$.¹⁹ Each ordinary representation Z may be taken as a
modular representation $\bar Z$. We shall show:


17 See [6], §9; [12], §6. 
18 Cf. [6]. 
19 See, for instance, the discussion of blocks by R. M. Thrall, On the decomposition of
modular tensors II (forthcoming).




9.2. The block $\Gamma^{(\tau)}$ of the group ring $\Gamma$ is completely permutative if and only if
all the ordinary representations $Z_i$ belonging to $\Gamma^{(\tau)}$ remain irreducible when taken
as modular representations.

Proof: Only if. Let $q_\tau$ denote the number of linearly independent symmetric
functions of $\Gamma^{(\tau)}$. $\Gamma^{(\tau)}$ is a symmetric algebra,²⁰ so $p_\tau = q_\tau$, and $q_\tau = s_\tau$ by
hypothesis. From 9.1 it follows that $\Gamma^{(\tau)}$ has elements of just one type, say
$(\kappa, \kappa)$, and $s_\tau = c_{\kappa\kappa}$. Then $\Gamma^{(\tau)}$ has only one irreducible representation, which
we denote by $F_\kappa$. Further, $Z_i$ has $F_\kappa$ as its only irreducible constituent, $\bar Z_i \leftrightarrow
d_{i\kappa}F_\kappa$, where the notation is to indicate that $F_\kappa$ appears $d_{i\kappa}$ times as constituent
of $Z_i$. The relations²¹


(30) $p_\tau = s_\tau = c_{\kappa\kappa} = \sum_{i=1}^{z_\tau} d_{i\kappa}^2,$

together with $p_\tau = z_\tau$, show that $d_{i\kappa} = 1$, $i = 1, 2, \ldots, z_\tau$, that is, $\bar Z_i$ is irreducible.

If. Any two ordinary irreducible representations $Z_i$, $Z_j$ of $\Gamma^{(\tau)}$ may be connected
by a chain $Z_i, Z_a, \ldots, Z_b, Z_j$ such that neighboring members have at
least one modular irreducible constituent in common. It follows that if the $Z_i$
are each irreducible as modular representations, then there is just one modular
irreducible representation $F_\kappa$ of $\Gamma^{(\tau)}$ and $\bar Z_i = F_\kappa$, $d_{i\kappa} = 1$, $i = 1, 2, \ldots, z_\tau$.
Then (30) holds, and $\Gamma^{(\tau)}$ is completely permutative.


10. Symmetric algebras 


In case a is a symmetric algebra, the question arises as to whether the basic
algebra $\bar{a}$ is also symmetric. In terms of symmetric functions an algebra a is
symmetric if there exists a symmetric function of a which does not map any
right ideal of a on zero other than the 0-ideal.

10.1. An algebra a is symmetric if and only if its basic algebra $\bar{a}$ is symmetric.

Proof: Let us suppose $\bar{a}$ is symmetric and that $\bar\varphi$ is a symmetric function of
$\bar{a}$ which does not map any proper right ideal of $\bar{a}$ on 0. Let $\varphi$ be the corresponding
symmetric function of a, and assume that the principal right ideal r generated
by 

a = Bu 
u,ab 


is mapped on 0 by g, and that h2g(a)e,, 0. Since = 
we have from (23), (26) 


= = 0 


where now the summation runs through the 6, of type (k», »). This holds for 
y = 1, 2,---, 7, and it follows that the principal right ideal of @ generated by 


20 [11], p. 657.
21 [5], (5).




& = >>, h24(a)8. ¥ 0 is mapped on 0 by ¢, which gives a contradiction. Then 
a is symmetric if $\bar{a}$ is.

If a is symmetric, let $\varphi$ be a symmetric function of a which does not map any
right ideal of a other than (0) on 0. If the corresponding function $\bar\varphi$ of $\bar{a}$ maps the principal
right ideal of $\bar{a}$ generated by


$\bar\alpha = \sum_u c_u\beta_u \neq 0$


on 0, then the principal right ideal of a generated by $\bar\alpha$ would be mapped on 0 by
the symmetric function $\varphi$. This shows that $\bar{a}$ is also symmetric.


III. REGULAR REPRESENTATIONS


11. In this section a study is made of the simple parts of the regular representations
R and S of a. By the method used in [11] it is possible to obtain a
reduced form²² of the regular representations by using as a basis the Cartan
basis system ordered in a suitable manner. For the sake of brevity, the computations
will be made only for the basic algebra $\bar{a}$; from the results for $\bar{a}$ it is easy
to infer what the corresponding computations for a itself would give.

To obtain a reduced form for the second regular representation S of $\bar{a}$, the
basis elements $\beta_u$, u = 1, 2, ..., r, are arranged as follows:

(i) The $\beta_u$ are classified according to the blocks to which they belong.²³

(ii) Sets $B_\rho$ are formed by taking all $\beta_u$ such that $\lambda_u = \rho$.

(iii) In the set $B_\rho$ are taken first the elements which do not belong to the
radical (there is only one, namely $e_{\rho,11}$), then those which belong to $\bar{n}$, followed
by those which belong to $\bar{n}^2$, and so on.

Let us use $p_\rho$ as subscript for the first $\beta_u$ of $B_\rho$, and $q_\rho$ for the last element, so
that $\beta_{p_\rho} = e_{\rho,11}$.

A detailed discussion of the splitting, under such an arrangement of the basis
elements, of the regular representation into indecomposable and irreducible constituents
has been given in [11], so here only the results will be stated. As a
consequence of (i), the second regular representation, S, of $\bar{a}$ decomposes into
parts $S_i$ which in fact are the regular representations of indecomposable two-sided
ideals. Each set $B_\rho$ forms a basis for a right ideal of $\bar{a}$, and the $S_i$ decompose
into indecomposable constituents $\mathfrak{U}_\rho$ corresponding to these sets $B_\rho$.
By the arrangement (iii) each such indecomposable constituent $\mathfrak{U}_\rho$ of S is split
according to its upper Loewy constituents²⁴ and is in reduced form.

The elementary modules $\mathfrak{H}_u$ of $\bar{a}$ are of first degree, that is, they consist of
1 × 1 matrices. The elementary modules $\mathfrak{H}_{p_\rho}$ corresponding to the $\beta_{p_\rho} =
e_{\rho,11}$ ($\rho$ = 1, 2, ..., k) are the irreducible representations of $\bar{a}$. We shall now
study the matrix in $\mathfrak{U}_\rho$ corresponding to the element $\bar\alpha = \sum_{z=1}^{r} H_z(\bar\alpha)\beta_z$. Let


22 For the statement of the meaning of reduced form of a representation, see [12], Introduction.

23 $\beta_u$, $\beta_v$ belong to the same block if there is a chain $\beta_u, \beta_a, \ldots, \beta_b, \beta_v$ such that neighboring
elements have at least one type index in common.

24 [11], Th. 3.




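$\beta_v \in B_\rho$; then

(31) $\bar\alpha\beta_v = \sum_z H_z(\bar\alpha)\,\beta_z\beta_v = \sum_{u=p_\rho}^{q_\rho} \Big( \sum_w c_{wvu} H_w(\bar\alpha) \Big)\beta_u = \sum_{u=p_\rho}^{q_\rho} E_{uv}(\bar\alpha)\,\beta_u.$


We observe that $c_{wvu} = 0$ except when $(\kappa_w, \lambda_w) = (\kappa_u, \kappa_v)$. Further, if we
denote by $\tau_z$ the power of the radical to which $\beta_z$ belongs, then from (31),
$\tau_w + \tau_v \le \tau_u$. Then $c_{wvu} = 0$ unless $\tau_w \le \tau_u - \tau_v$. This gives that $E_{uv}(\bar\alpha) =
\sum_w c_{wvu} H_w(\bar\alpha)$, where $(\kappa_w, \lambda_w) = (\kappa_u, \kappa_v)$ and $\tau_w \le \tau_u - \tau_v$. In particular, when
u = v, $(\kappa_w, \lambda_w) = (\kappa_v, \kappa_v)$ and $\tau_w = 0$, hence $w = p_{\kappa_v}$, and $E_{vv}(\bar\alpha) = H_{p_{\kappa_v}}(\bar\alpha)$.


Also, when $v = p_\rho$, (31) becomes


$\bar\alpha\beta_{p_\rho} = \bar\alpha e_{\rho,11} = \sum_{z=p_\rho}^{q_\rho} H_z(\bar\alpha)\beta_z,$

so that $E_{u p_\rho}(\bar\alpha) = H_u(\bar\alpha)$.

The module $\mathfrak{E}_{uv}$, consisting of the matrices $E_{uv}(\bar\alpha)$, is a simple part²⁵ of the
indecomposable constituent $\mathfrak{U}_\rho$ of S corresponding to the basis elements of $B_\rho$.
The computation of the regular representation of a merely replaces the elementary
modules $\mathfrak{H}_w$ of the basic algebra $\bar{a}$ by the corresponding elementary
modules $\mathfrak{H}_w$ of a, and so we obtain

11.1. The indecomposable constituent $\mathfrak{U}_\rho$ of the regular representation S of a
may be written with simple parts $\mathfrak{E}_{uv}$ $(u, v = p_\rho, p_\rho + 1, \ldots, q_\rho)$,


$\mathfrak{U}_\rho = (\mathfrak{E}_{uv}),$


where the simple part $\mathfrak{E}_{uv}$ is of type $(\kappa_u, \kappa_v)$, and is a linear combination $\sum_w c_{wvu}\mathfrak{H}_w$
of elementary modules $\mathfrak{H}_w$ of type $(\kappa_u, \kappa_v)$ which belong to $n^{\tau_w}$, $\tau_w \le \tau_u - \tau_v$. In
particular, $\mathfrak{E}_{vv} = \mathfrak{H}_{p_{\kappa_v}}$, $\mathfrak{E}_{u p_\rho} = \mathfrak{H}_u$.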

Remarks: Let ¢t = Tq, » that is, ¢ is the highest power of the radical which con- 


tains a basis element of the set B,. Let %, denote the upper Loewy constituent 
of %, which corresponds to n’. Then if we split %, according to its upper 
Loewy constituents, we obtain 


25 For the definition of simple part of a matrix algebra, see [12], Introduction.




where the non-zero simple parts of &,,; belong to n' with i < m — j. The first 
Loewy constituent &% is the irreducible representation D>, of a” and from 11.1 it


follows that %,,o is of form 
( Su, 


tio = 


(Du, 


where Bu, , °°: , Bu, are the elements of B, which belong to i’. Then there 
exist elements of a for which the corresponding matrices in %, are of form 

| 


(32) 


where the parts $\mathfrak{U}_{j,0}$ in the first column may be arbitrarily chosen. If a is a
quasi-Frobeniusean algebra²⁷ then $\mathfrak{U}_{t,0}$ contains just one elementary module of
type $(\rho, \rho^*)$ (where * denotes a permutation of 1, 2, ..., k). If a is symmetric,
then $\mathfrak{U}_{t,0}$ consists of one elementary module of type $(\rho, \rho)$.

Analogous statements may be made in regard to the simple parts of the inde- 
composable constituents of the first regular representation. 


THE UNIVERSITY OF MICHIGAN.


BIBLIOGRAPHY 


1. A. A. ALBERT, Structure of Algebras, American Mathematical Society Colloquium
Publications, vol. 24, 1939.

2. R. BRAUER, Über die Darstellung von Gruppen in Galoisschen Feldern, Actualités scientifiques
et industrielles, No. 195 (1935).

3. R. BRAUER, On sets of matrices with coefficients in a division ring, Transactions of the
American Mathematical Society, vol. 49 (1941), pp. 502-548.

4. R. BRAUER, On the modular representations of algebras, Proceedings of the National
Academy of Sciences, vol. 25 (1939), p. 290.

5. R. BRAUER, C. NESBITT, On the regular representations of algebras, Proceedings of the
National Academy of Sciences, vol. 23 (1937), pp. 430-434.

6. R. BRAUER, C. NESBITT, On the modular representations of groups of finite order, University
of Toronto Studies, Mathematical Series, vol. 4, 1937.

7. R. BRAUER, C. NESBITT, On the modular characters of groups, Annals of Mathematics,
vol. 42 (1941), pp. 556-590.


26 [11], Th. 2.
27 For the definition of quasi-Frobeniusean algebra, see [9]. 




8. MARSHALL HALL, The position of the radical in an algebra, Transactions of the American
Mathematical Society, vol. 48 (1940), pp. 391-404.
9. T. NAKAYAMA, On Frobeniusean algebras, I, Annals of Mathematics, vol. 40 (1939),
pp. 611-633.
10. T. NAKAYAMA, C. NESBITT, Note on symmetric algebras, Annals of Mathematics, vol. 39
(1938), pp. 659-668.
11. C. NESBITT, On the regular representations of algebras, Annals of Mathematics, vol. 39
(1938), pp. 634-658.
12. W. M. SCOTT, On matrix algebras over an algebraically closed field, Annals of Mathematics,
vol. 43 (1942), pp. 147-160.




ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943


OPEN ADDITIVE SEMI-GROUPS OF COMPLEX NUMBERS* 


By EINAR HILLE AND MAX ZORN
(Received March 3, 1943) 


The main object of the present paper is to determine the open, connected,
additive semi-groups of the complex number plane. These sets are important
as parameter manifolds of one-parameter semi-groups of linear transformations
which have been studied in detail by one of us.¹ It is shown that these parameter
manifolds depend on the upper semi-continuous solutions of the inequality²


$\varphi(\xi_1 + \xi_2) \le \varphi(\xi_1) + \varphi(\xi_2),$


the function $\varphi$ being defined on the ray $\xi > 0$ or on the whole line. The semi-group
consists of all numbers $\xi + i\eta$ with $\eta > \varphi(\xi)$.

Finally we show that every open semi-group is the maximal domain of existence
of a one-parameter analytic semi-group of linear transformations defined
on a suitable Banach space.

It is obvious from the definition of a semi-group (see below) that the complex
numbers may be replaced by any two-dimensional vector space over the real
number field. Furthermore it turns out that our considerations apply with
minor modifications to n-dimensional vector spaces, and many intermediate
definitions and results are valid in still more general cases. We may present
this more general form of the theory, together with a study of closed semi-groups
and with additional facts about the structure of open plane semi-groups, on
another occasion.


1. Definition and structural properties 


Dealing with complex numbers as elements of a two-dimensional vector space
$A = A_2$ over the reals we define
1.1 A semi-group S of the (group-) space A is a subset of A which (i) is additive,
(ii) possesses in every neighborhood of 0 an element different from 0.³ ⁴


* Presented to the American Mathematical Society November 28, 1942. 

1 See E. Hille [5] in the list of references at the end of the present paper. 

2 This inequality has been studied by R. Cooper [4]. He assumes $\varphi(\xi)$ defined for all $\xi$
and proves that a solution which is monotonic for large positive (negative) $\xi$ has the property
that $\varphi(\xi)/\xi$ tends to a finite limit when $\xi \to +\infty$ $(-\infty)$. He also shows that an odd
solution is of the form $k\xi$ and an even solution is nowhere negative and gives examples of
discontinuous measurable solutions.

3 There exist additive sets which do not satisfy (ii). Assuming the set to be a convex 
domain, E. Hille has shown that it is the maximal domain of existence of a one-parameter 
analytic semi-group of linear transformations on Li(—7, 7) to itself. His construction is 
entirely different from that given in §3 below. 

4 If the group space is finite dimensional, $A = A_n$, we suppose that a metric is introduced
in $A_n$, for instance the ordinary Euclidean one, and that the metric defines the neighborhood
topology. We denote the length of a by |a|. In the present paper n = 1 or 2.




The closure $\bar S$ of a semi-group S is likewise a semi-group. The following
observation, which is capable of considerable generalization, is basic for our
theory.

1.2 To every semi-group S of A there exists a vector $b \neq 0$ such that the ray $\rho b$,
$\rho \ge 0$, is in the closure of S.

Indeed, there must exist a sequence $\{a_i\}$, $a_i \neq 0$, $a_i \in S$ with $\lim a_i = 0$.
The unit vectors $a_i/|a_i|$ must have at least one limit point in A and without
loss of generality we may assume $\lim a_i/|a_i| = b$ where $1 = |b| \neq 0$. For
$\rho \ge 0$ we determine a sequence of integers $n_i$ such that $n_i > 0$, $\lim n_i|a_i| = \rho$,
for instance, $n_i = [\rho/|a_i|] + 1$. The relation $\rho b = (\lim n_i|a_i|)(\lim a_i/|a_i|) =
\lim n_i a_i$ shows that $\rho b$ is limit of a sequence from S.
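
The construction in this proof can be followed numerically. The sketch below takes an assumed sample sequence of vectors of length 1/i whose directions tend to a fixed unit vector b (standing in for elements of a semi-group S; the particular sequence, the angle 0.7 and the function names are illustrative assumptions) and forms the multiples with the integers [ρ/|a_i|] + 1 from the proof.

```python
import math


def ray_approximations(rho, direction=0.7, indices=(10, 100, 1000, 10000)):
    """Return the points n_i * a_i for a sample sequence a_i tending to 0."""
    points = []
    for i in indices:
        length = 1.0 / i                    # |a_i|, tending to 0
        angle = direction + 1.0 / i         # a_i/|a_i| tends to b = (cos, sin)(direction)
        n = math.floor(rho / length) + 1    # n_i = [rho/|a_i|] + 1, a positive integer
        points.append((n * length * math.cos(angle), n * length * math.sin(angle)))
    return points


def distance_to_target(point, rho, direction=0.7):
    """Distance from a point to rho * b."""
    target = (rho * math.cos(direction), rho * math.sin(direction))
    return math.hypot(point[0] - target[0], point[1] - target[1])


for point in ray_approximations(2.5):
    print(distance_to_target(point, 2.5))
```

Since every n_i a_i is a sum of n_i copies of an element of S, each printed distance measures how far an actual element of S lies from the point ρb of the ray; the distances shrink as i grows.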

1.3 An open semi-group S is the interior of its closure $\bar S$.

In other terms: if a point of A has a neighborhood on which S is dense it must
be an element of S. Or, if a point x of A does not belong to S, then every
neighborhood $U_x$ contains an open set which is not void and has no points in
common with S.

To prove this we take a neighborhood $U_0$ of the 0-element such that $x - U_0$ is
in $U_x$; in this neighborhood there will be a vector y which, together with a whole
neighborhood $U_y$, is contained in both S and $U_0$. The non-void open set
$x - U_y$ is contained in $U_x$, but has no points in S; for if $x - u$ were in S, $x = u +
(x - u)$ would be, which is not true. This proves our lemma.

We note that the proof of this lemma extends literally to Hausdorff groups,
but in this paper we shall use it only for $A_2$ and $A_1$.

An important consequence is
1.4 For an open semi-group S we have $S + \bar S \subset S$.

Since the closure of a semi-group is a semi-group, $S + \bar S$ is first of all contained
in $\bar S$; and with every vector $s + \bar s$ we find a whole neighborhood $U_s + \bar s$ in this
set; in other words, every point of $S + \bar S$ is an interior point of $\bar S$ and therefore
contained in S itself.

1.5 For an open semi-group S at least one vector $b \neq 0$ exists such that with s all
vectors $s + \rho b$, $\rho \ge 0$, are elements of S.

It suffices to choose the vector b in $\bar S$ which is furnished by 1.2 and then to
apply 1.4. Since S is open, a whole neighborhood of s and in particular all
points $s + \eta' b$ with $|\eta'| < \epsilon(s)$, for suitable $\epsilon(s) > 0$, will belong to S. We may
therefore state:

1.5.1 For an open semi-group S there exists a vector $b \neq 0$ and a function $\epsilon(s) > 0$
such that with s all vectors $s + \eta' b$, $\eta' > -\epsilon(s)$, are contained in S.


2. Representation by inequalities 
For the sequel it will be useful to introduce the symbol $-\infty$ with the conventions
$-\infty + \zeta = -\infty$ and $\eta > -\infty$ for all real numbers $\eta$.
Using $\zeta$ as a common letter for $-\infty$ and reals we state without proof:
2.1.1 A set $\{\eta\}$ of real numbers which contains with every $\eta$ all $\eta'$ with $\eta' > \eta -
\epsilon(\eta)$, $\epsilon(\eta) > 0$, may always be described by an inequality $\eta > \zeta$.




2.1.2 An inequality $\zeta_1 + \zeta_2 \ge \zeta_3$ is equivalent with the fact that $\eta_1 > \zeta_1$, $\eta_2 > \zeta_2$
implies $\eta_1 + \eta_2 > \zeta_3$.
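
The paper states 2.1.2 without proof. One way to see it, sketched here under the conventions for $-\infty$ introduced above (in particular, the infimum of all reals exceeding $-\infty$ is taken to be $-\infty$), reads the statement as: $\zeta_1 + \zeta_2 \ge \zeta_3$ holds exactly when $\eta_1 > \zeta_1$, $\eta_2 > \zeta_2$ always give $\eta_1 + \eta_2 > \zeta_3$.

```latex
\begin{align*}
(\Rightarrow)\quad & \eta_1 > \zeta_1,\ \eta_2 > \zeta_2
  \ \Longrightarrow\ \eta_1 + \eta_2 > \zeta_1 + \zeta_2 \ge \zeta_3 . \\
(\Leftarrow)\quad & \zeta_3 \le \inf\{\, \eta_1 + \eta_2 : \eta_1 > \zeta_1,\ \eta_2 > \zeta_2 \,\}
  = \zeta_1 + \zeta_2 .
\end{align*}
```

In the second line the hypothesis says that $\zeta_3$ is a lower bound of all admissible sums $\eta_1 + \eta_2$, hence it is at most their infimum.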

Returning to the open semi-groups we introduce a vector a, not colinear
with b, such that every vector may be written in the form $\xi a + \eta b$.⁵ For every
element $s = \xi a + \eta b$ of S all vectors $s' = \xi a + \eta' b$, for a suitable $\epsilon(s) > 0$ and
all $\eta' > \eta - \epsilon(s)$, are within S. The ordinates $\eta$ of the elements in S which have
the abscissa $\xi$ fulfill thus the condition of 2.1.1, and we may say:

2.2 An open semi-group consists of all vectors $\xi a + \eta b$, where $\xi$ runs through a
subset T of $A_1$ and $\eta$ satisfies $\eta > \varphi(\xi)$.

We denote a set which in this manner is determined by a set T and a function
$\varphi$ on T as the “restricted product” $(T, \varphi)$.

2.3 In order that the set $(T, \varphi)$ be additive it is necessary and sufficient that
(i) T is additive,
(ii) $\varphi(\xi)$ satisfies the functional inequality


2.3.1 $\varphi(\xi_1 + \xi_2) \le \varphi(\xi_1) + \varphi(\xi_2).$


If $\xi_i a + \eta_i b$ are in $(T, \varphi)$ then $(\xi_1 + \xi_2)a + (\eta_1 + \eta_2)b$ is in $(T, \varphi)$ exactly if
(i) $\xi_1 + \xi_2$ is in T, (ii) $\eta_1 + \eta_2 > \varphi(\xi_1 + \xi_2)$ is implied by $\eta_i > \varphi(\xi_i)$. The first
of these conditions is additivity of T, the second is equivalent with the inequality
2.3.1 by virtue of 2.1.2.
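
The criterion can be tried out on sample points. In the sketch below a point of the restricted product is stored as a pair (ξ, η) of coordinates with respect to a and b; the helper names, the sample grids and the two test functions (the square root, which satisfies 2.3.1 on ξ > 0, and the square, which does not) are assumptions chosen for illustration, and a finite search of this kind can only exhibit a failure of additivity, not prove additivity.

```python
import itertools
import math


def in_restricted_product(xi, eta, phi, in_T):
    """Membership of xi*a + eta*b in (T, phi): xi lies in T and eta > phi(xi)."""
    return in_T(xi) and eta > phi(xi)


def find_additivity_failure(phi, in_T, xis, margins):
    """Search sample points of (T, phi) for a pair whose sum leaves the set."""
    points = [(xi, phi(xi) + m) for xi in xis if in_T(xi) for m in margins]
    for (x1, e1), (x2, e2) in itertools.product(points, repeat=2):
        if not in_restricted_product(x1 + x2, e1 + e2, phi, in_T):
            return (x1, e1), (x2, e2)
    return None


def positive_ray(xi):
    """T is the open ray xi > 0."""
    return xi > 0


sample_xis = [0.1, 0.5, 1.0, 2.0, 3.5]
sample_margins = [0.01, 1.0]
print(find_additivity_failure(math.sqrt, positive_ray, sample_xis, sample_margins))
print(find_additivity_failure(lambda xi: xi * xi, positive_ray, sample_xis, sample_margins))
```

For the square root no violating pair is found among the samples, in line with 2.3; for the square a violating pair is reported, since two points lying just above the curve can add up to a point below it.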

2.4 In order that the restricted product $(T, \varphi)$ be open in $A_2$ it is necessary and sufficient that
(i) $T$ is an open set in $A_1 = \{\xi\}$,
(ii) the function $\varphi(\xi)$ is upper semi-continuous in the sense that for real $\sigma$ the set of all $\xi$ for which $\varphi(\xi) < \sigma$ be open in $A_1$.

Note that by virtue of (i) we might ask instead that the set be open in $T$. If now $(T, \varphi)$ is open, and $\eta > \varphi(\xi)$, there will be numbers $\varepsilon, \delta > 0$ such that for $|\xi' - \xi| < \varepsilon$, $|\eta' - \eta| < \delta$ the vectors $\xi' a + \eta' b$ are in $(T, \varphi)$, or $\eta' > \varphi(\xi')$. The numbers $\xi'$ are in $T$ and constitute an $A_1$-neighborhood of $\xi$, which shows that $T$ is open in $A_1$. If $\eta$ is less than a real number $\sigma$ we see that in this neighborhood $\varphi(\xi')$ is less than $\sigma$ by choosing $\eta' = \eta$ in $\eta' > \varphi(\xi')$. Conversely, if conditions (i) and (ii) are satisfied, and $\eta > \varphi(\xi)$, let us choose a $\delta > 0$ such that $\varphi(\xi) < \eta - \delta$; since $\varphi$ is supposed to be upper semi-continuous and $T$ open there will be a whole neighborhood $|\xi' - \xi| < \varepsilon$ which is in $T$ and where $\varphi(\xi') < \eta - \delta$. Therefore $\varphi(\xi') < \eta'$ will be true for $|\xi' - \xi| < \varepsilon$, $|\eta' - \eta| < \delta$, which means that $(T, \varphi)$ is open.
2.5 In order that $(T, \varphi)$ have in every neighborhood of 0 an element $\ne 0$ it is necessary and sufficient that
(i) there exists at least one point of $T$ in every $A_1$-neighborhood of 0,
(ii) $\liminf_{\xi \to 0} \varphi(\xi) \le 0$.


5 If we work with complex numbers we may apply a rotation and if necessary a reflection to $S$ such that $a = 1$, $b = i$, and such that there are points in $S$ with abscissas greater than zero.




ADDITIVE SEMI-GROUPS 557 


We are using the definition $\liminf_{\xi \to 0} \varphi(\xi) = \sup_{\varepsilon > 0} \inf_{0 < |\xi| < \varepsilon} \varphi(\xi)$ and leave the proof of this lemma to the reader. Note that the first condition does not necessarily produce an infinity of points in every $A_1$-neighborhood of 0; this will however be the case if the set $T$ is open in $A_1$. Altogether the conditions marked (i), valid jointly if $(T, \varphi)$ is or is to be an open semi-group, add up to the statement that $T$ is an open semi-group of the one dimensional vector space $A_1$.

2.6 An open semi-group $T$ of $A_1$ is one of the three sets: $\xi > 0$, $\xi < 0$, $A_1$.

PROOF. The closure $\bar T$ of $T$ contains a ray $\rho(\pm 1)$, $\rho \ge 0$, according to 1.2; if it contains an additional element $\nu(\pm 1)$, $\nu < 0$, it contains all real numbers $t = (n\nu + \rho)(\pm 1)$. (Note that we have just determined all closed semi-groups of $A_1$.) In view of the generalization of 1.3 the semi-group will be the interior, with respect to $A_1$, of one of these three sets, which proves our theorem.

Combining this with the conditions marked (ii) we obtain the following de- 
scription of the open semi-groups. 

2.7 After introduction of a suitable basis $a$, $b$ an open semi-group consists of all vectors $\xi a + \eta b$ with $\eta > \varphi(\xi)$ where
(i) $\xi$ varies over one of the three sets $\xi > 0$, $\xi < 0$, or $A_1$,
(ii) the function $\varphi$ has real numbers or $-\infty$ as values, is upper semicontinuous, satisfies the inequality 2.3.1 and the condition $\liminf_{\xi \to 0} \varphi(\xi) \le 0$. Vice versa, any restricted product satisfying conditions (i) and (ii) will constitute an open semi-group of $A_2$.

In this paper we record only such information about semi-groups as we need 
for the analytic developments of the following section. The next four state- 
ments suffice for our purposes. 

2.8 An open semi-group $S$ is connected.

We show right away that $S$ is arc-wise connected by exhibiting for any two elements $\xi_i a + \eta_i b$ ($i = 1, 2$) of $S$ a polygon which connects them and is contained in $S$. The construction is possible because the upper semicontinuous function $\varphi$ has an upper bound $u > \varphi(\xi)$ on the interval $\xi_1 \le \xi \le \xi_2$. The polygon is then made up by the following three (possibly degenerate) arcs, with $0 \le \theta \le 1$:
(i) $\xi = \xi_1$, $\eta = \eta_1 + \theta(u - \eta_1)$;
(ii) $\xi = \xi_1 + \theta(\xi_2 - \xi_1)$, $\eta = u$;
(iii) $\xi = \xi_2$, $\eta = u + \theta(\eta_2 - u)$.
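
The three arcs of the polygon can be sampled directly. This is a minimal sketch, assuming a hypothetical boundary function $\varphi(\xi) = |\xi|$, T the whole real line, and two made-up points of S; none of these choices come from the text.

```python
def phi(xi):
    """Hypothetical upper semi-continuous boundary function."""
    return abs(xi)

def polygon(p1, p2, u, steps=50):
    """Sample the arcs (i)-(iii) joining p1 = (xi1, eta1) to p2 = (xi2, eta2)."""
    (xi1, eta1), (xi2, eta2) = p1, p2
    thetas = [k / steps for k in range(steps + 1)]
    arc1 = [(xi1, eta1 + t * (u - eta1)) for t in thetas]   # vertical, up to level u
    arc2 = [(xi1 + t * (xi2 - xi1), u) for t in thetas]     # horizontal, at level u
    arc3 = [(xi2, u + t * (eta2 - u)) for t in thetas]      # vertical, down to eta2
    return arc1 + arc2 + arc3

p1, p2 = (-2.0, 2.5), (3.0, 3.25)   # both satisfy eta > phi(xi)
u = 4.0                             # exceeds phi on [-2, 3]
path = polygon(p1, p2, u)
path_in_S = all(eta > phi(xi) for xi, eta in path)
```

Every sampled point satisfies $\eta > \varphi(\xi)$, which is the sense in which the polygon stays inside S.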


2.9 An open semi-group $S$ is simply connected.

Let $C$ be a simple closed curve consisting of points in $S$; we have to show that a point $c$ inside of $C$, that is, a point which cannot be connected with arbitrarily far points by an arc which does not intersect $C$, is necessarily contained in $S$. Indeed, consider the ray $c - \rho b$, $\rho \ge 0$. This ray will intersect $C$ at a point $s_0 = c - \rho_0 b$ of $S$, and that implies, by virtue of 1.5, that $c = s_0 + \rho_0 b$ is in $S$.

2.10 If $S$ is not the whole plane $A_2$ there exist points whose distance from the set $S$ is $> 0$.




That is so because S is the interior of its closure. 
2.11 If S is not the whole plane it does not contain the 0-element. 
For with 0 all sufficiently small vectors would have to be in S, and every vector 
is a multiple of arbitrarily small vectors. 


3. Maximal parameter sets 


In this paragraph we shall prove an existence theorem for analytical semi-groups of linear bounded transformations on a Banach space to itself. Some preliminary explanations and definitions of the concepts involved are in order.

Let $E$ be a Banach space of elements $x, y, \ldots$, and let $E^*$ be the adjoint space of linear bounded functionals defined on $E$. Let $S$ be a set of complex numbers $s = \sigma + i\tau$ and let $\{T_s\}$ be a one-parameter family of linear bounded transformations on $E$ to $E$ defined for $s \in S$.

3.1. $\{T_s\}$ is said to be a semi-group if

(i) the set $S$ is additive, and

(ii) $T_s T_t x = T_t T_s x = T_{s+t} x$ for all $x \in E$ and all $s, t \in S$.

3.2. $T_s$ is said to be holomorphic in $S$ if (i) $S$ is a domain, and (ii) the complex valued functions $L(T_s x)$ are holomorphic in $S$ for all $x \in E$ and all $L \in E^*$. $S$ is said to be the maximal domain of analytic existence of $T_s$ if every accessible boundary point of $S$ is a singular point of at least one of the functions $L(T_s x)$.⁶

This definition of maximal domain of existence disregards completely the possibility of quasi-analytic or more general non-analytic continuation of $T_s$ valid for all elements of $E$. We also disregard the possibility of finding an analytic continuation of $T_s$ valid on some subspace of $E$. Simple examples of both possibilities can be found.

N. Dunford⁷ has found a simple criterion for holomorphism in $S$:

3.3. A necessary and sufficient condition in order that $T_s$ be holomorphic in $S$ is that the difference quotient $(1/h)(T_{s+h} - T_s)x$, $s \in S$, $s + h \in S$, shall converge strongly to a limit when $h \to 0$ for every $x \in E$.

We shall now prove⁸

3.4. Given an open additive set $S$ in the complex plane such that $s = 0$ belongs to $\bar S - S$, i.e., $S$ is an open semi-group in the sense of definition 1.1. Then there exists a Banach space $E$ and a one-parameter family $\{T_s\}$ of bounded linear transformations on $E$ to $E$ defined for $s$ in $S$ such that (i) $\{T_s\}$ is a semi-group in the sense of 3.1, (ii) $T_s$ is holomorphic in $S$, and (iii) $S$ is the maximal domain of analytic existence of $T_s$.

That S is a domain was stated in 2.8. 

We shall give two slightly different constructions. In the first we take for E 
the set of all functions f(z) bounded and holomorphic in S and define ||f(z)|| = 


6 That is, to the accessible boundary point $s_0$ should correspond at least one function $L(T_s x)$ and at least one rectifiable arc $C$ in $S$ ending at $s_0$ such that the radius of convergence of the Taylor expansion of $L(T_s x)$ about $s = s_1$ on $C$ tends to zero when $s_1 \to s_0$ along $C$.

7 See E. Hille [5], pp. 6-7.

8 We exclude the case in which $S$ is the whole finite plane since the existence of analytical groups is well known.




$\sup_{z \in S} |f(z)|$. This is a normed linear vector space complete in its metric. We then define $T_s f(z) = f(z + s)$ for $s \in S$. This is clearly a semi-group in the sense of 3.1. In order to prove (ii) we use the criterion given in 3.3. We have the lemma

3.5. If $s$ and $s + h$ belong to $S$, we have

$$\| f(z + s + h) - f(z + s) \| \le \frac{2|h|}{\delta(s)} \, \| f(z) \|$$

for $|h| < \tfrac{1}{2}\delta(s)$ where $\delta(s)$ is the greatest lower bound of the distance of $z + s$ from the boundary of $S$ when $z$ ranges over $S$ and $s$ is fixed in $S$.

The lemma can be proved with the aid of Schwarz's lemma but Cauchy's integral gives a more direct proof. We have

$$f(z + s + h) - f(z + s) = \frac{h}{2\pi i} \int_C \frac{f(w)\, dw}{(w - z - s)(w - z - s - h)}$$

where $C$ is a circle interior to $S$ with center at $w = z + s$ which also contains $z + s + h$. From this representation the inequality is an immediate consequence and 3.5 implies 3.4 (ii).
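
The estimate of 3.5 can be checked numerically in a simple case. The sketch below makes several assumptions that are not in the text: S is taken to be the right half-plane $\Re z > 0$ (so that $\delta(s) = \Re s$), the test function is $f(z) = 1/(1+z)$ with $\|f\| = 1$, and the bound is read in the form $2|h|\,\|f\|/\delta(s)$ for $|h| < \tfrac12\delta(s)$, which is what the Cauchy estimate yields.

```python
import cmath
import math

def f(z):
    """Hypothetical bounded holomorphic function on Re z > 0."""
    return 1 / (1 + z)

NORM_F = 1.0  # sup of |1/(1+z)| over Re z > 0 (approached as z -> 0)

def delta(s):
    """Distance of the translated half-plane S + s from the boundary of S."""
    return s.real

def lemma_bound_holds(s, h, zs):
    """Compare the largest sampled difference with 2|h| ||f|| / delta(s)."""
    if not abs(h) < delta(s) / 2:
        raise ValueError("h must satisfy |h| < delta(s)/2")
    lhs = max(abs(f(z + s + h) - f(z + s)) for z in zs)
    rhs = 2 * abs(h) * NORM_F / delta(s)
    return lhs <= rhs

# Sample points of S, a fixed s in S, and eight increments h with |h| = 0.4.
zs = [complex(x, y) for x in (0.01, 0.5, 2.0, 10.0) for y in (-5.0, 0.0, 5.0)]
s = complex(1.0, 0.5)
hs = [0.4 * cmath.exp(1j * math.pi * k / 4) for k in range(8)]
all_ok = all(lemma_bound_holds(s, h, zs) for h in hs)
```

A sampled check of this kind only illustrates the inequality on a finite grid; it is no substitute for the Cauchy integral argument.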

We note that $S$ is a simply-connected domain which omits the interior and boundary of a suitably chosen circle, by 2.10. Hence $E$ contains all functions which are bounded and holomorphic outside a suitably chosen circle. Of greater interest to us is the fact that $E$ contains elements which have $S$ as their natural domain of existence. Such elements can be constructed as follows. Let $w = F(z)$ be a function which maps $S$ conformally on the interior of the unit-circle, $|w| < 1$, and let $\sum_{n=0}^{\infty} a_n w^n$ be a power series having $|w| = 1$ as its natural boundary and such that $\sum |a_n| < \infty$. Then $\sum a_n [F(z)]^n$ is an element of $E$ having $S$ as its natural domain of existence.

In order to prove (iii) we now argue as follows. We denote by $E_0$ the linear sub-space consisting of all functions $f(z)$ in $E$ such that $\lim_{n \to \infty} f(z_n)$ exists where $z_n \in S$ and $z_n \to 0$ when $n \to \infty$, $\{z_n\}$ being a fixed preassigned but otherwise arbitrary point-set. We denote this limit arbitrarily by $f(0)$ when it exists. Then the convention $L_0[f(z)] = f(0)$ defines a linear bounded functional on $E_0$ such that $|L_0[f]| \le \|f\|$. We note that for a fixed $s$ in $S$ and any $f(z)$ in $E$ we have $f(z + s) \in E_0$ so that $L_0[f(z + s)] = f(s)$. The Hahn-Banach theorem having been generalized to complex-valued functionals by F. Bohnenblust and A. Sobczyk [3], we can extend the functional $L_0$ with unchanged norm to all elements of $E$. We have consequently $L_0[T_s f(z)] = f(s)$ for every $f(z) \in E$ and all $s$ in $S$. It is obvious that $L_0[T_s f(z)]$ is holomorphic in $S$ as is required by 3.2. But we have just seen that there are elements of $E$ having $S$ as their natural domain of existence. For such a choice of $f(z)$ the function $L_0[T_s f(z)] = f(s)$ has $S$ as its natural domain of existence. It follows that $S$ is the maximal domain of analytic existence of $T_s$. This completes the argument in the first case.

For the second construction we consider instead the space $E_2$ of functions $f(z)$ holomorphic in $S$ such that

$$\iint_S |f(z)|^2 \, d\omega < \infty.$$




We note that if $g(z) \in E$, the space of bounded holomorphic functions in $S$, and if $z = a$ has positive distance from $S$, then $g(z)(z - a)^{-2} \in E_2$. Furthermore, this function will have $S$ as its natural domain of existence if $g(z)$ has this property. It is easy to see that $E_2$ is a Hilbert space and a fortiori a complex Banach space.

From this it follows that every bounded linear functional defined on $E_2$ is given by the formula

$$L[f] = \iint_S f(z)\, \overline{g(z)}\, d\omega$$

where $g(z)$ is any element of $E_2$. A particular case of this formula should be noted. It has been shown by S. Bergmann [1] and S. Bochner [2] that there exists a kernel $K(t, z)$ such that

$$f(z) = \iint_S K(t, z)\, f(t)\, d\omega,$$

where the integration is taken with respect to $t$. Here $f(z)$ is any element of $E_2$, $K(t, z)$ depends only upon $S$ but not upon $f(z)$, $K(t, z) = \overline{K(z, t)}$, and for fixed $t$ in $S$, $K(t, z)$ is an element of $E_2$. Hence we may choose $g(z) = K(a, z)$, where $a$ is a fixed point in $S$, and obtain a functional $L_a$ such that $L_a[f(z)] = f(a)$.

We now define $T_s f(z) = f(z + s)$ for $f(z) \in E_2$, $s \in S$. This definition satisfies 3.4 (i). But

$$L[T_s f(z)] = \iint_S f(z + s)\, \overline{g(z)}\, d\omega$$

and this is clearly a holomorphic function of $s$ in $S$ so that (ii) also holds.

Suppose now that $s_0$ is an accessible boundary point of $S$ and suppose that the maximal domain of analytic existence of $T_s$ should include $s_0$.⁹ Then there must exist a small fixed neighborhood of $s = s_0$ in which all the functions $L[T_s f(z)]$ are holomorphic. But if we choose in particular $L = L_a$, then $L_a[T_s f(z)] = f(a + s)$, and all these functions must be holomorphic in the neighborhood in question, no matter how we choose $a$ in $S$ and $f(z)$ in $E_2$. But this is impossible as we see by taking for $f(z)$ a function having $S$ as its natural domain of existence and choosing $a$ sufficiently near to the origin. Hence $S$ is the maximal domain of analytical existence of $T_s$ and the second construction has been completely verified.


REFERENCES 


[1] S. Bergmann, Über die Entwicklung der harmonischen Funktionen der Ebene und des Raumes nach Orthogonalfunktionen, Mathematische Annalen, 86 (1922) 238-271.

[2] S. Bochner, Über orthogonale Systeme analytischer Funktionen, Mathematische Zeitschrift, 14 (1922) 180-207.


9 If $S$ has been normalized through rotation in such a manner that $a = 1$, $b = i$ in the representation 2.7, then every vertical line $z = \xi$ contains an accessible boundary point of $S$.




[3] H. F. Bohnenblust and A. Sobczyk, Extensions of Functionals on Complex Linear Spaces, Bulletin of the American Mathematical Society, 44 (1938) 91-93.

[4] R. Cooper, The Converses of the Cauchy-Hölder Inequality and the Solutions of the Inequality g(x + y) ≤ g(x) + g(y), Proceedings of the London Mathematical Society, (2) 26 (1926-27) 415-432.

[5] E. Hille, Notes on Linear Transformations. II. Analyticity of Semi-Groups, Annals of Mathematics, (2) 40 (1939) 1-47.


YALE UNIVERSITY, 
UNIVERSITY OF CALIFORNIA. 




ANNALS OF MATHEMATICS 
Vol. 44, No. 3, July, 1943 


ON A PROJECTIVE INVARIANT OF A NON-HOLONOMIC SURFACE 


By Hsien-Chung Wang*
(Received November 9, 1942) 


1. Introduction. The purpose of this paper is to give a projective invariant 
of a non-holonomic surface in a three-dimensional space. This invariant can be 
regarded as an analogue of the projective linear element of an ordinary surface, 
since it has a geometrical meaning similar to the latter. Analytically, as we 
shall show below, it is the quotient of the product of two quadratic differential 
forms by the square of a Pfaffian form. While there are projectively distinct 
ordinary surfaces which have the same projective linear element,¹ we shall
prove in our case the main theorem that two non-holonomic surfaces have the 
same invariant when and only when there exists a collineation or a correlation 
carrying one surface to the other. 


2. Canonical frames and the projective linear element of a non-holonomic surface. Let $x$, $y$, $z$ be a set of coordinates in a projective space of three dimensions. A non-holonomic surface $S$ is defined by an equation of the form²

(1) $A(x, y, z)\,dx + B(x, y, z)\,dy + C(x, y, z)\,dz = 0$.

Geometrically it associates to every point $A$ of the space a plane $\pi$ through the point. The plane $\pi$ is called the tangent plane of $S$ at $A$. There are in general two directions on $\pi$ through $A$ having the property that the tangent plane at a neighboring point $A'$ of $A$ on any of these two directions intersects $\pi$ along $AA'$. We assume these two directions to be real and distinct and call them the asymptotic directions. The two lines through $A$ along the asymptotic directions are called the asymptotic tangents.³

Following Cartan we take as a projective frame⁴ a set of four linearly independent analytic points $A$, $A_1$, $A_2$, $A_3$ such that $|A A_1 A_2 A_3| = 1$. To each point of the space we attach the most general family of frames $A A_1 A_2 A_3$ satisfying the conditions: 1) $A$ coincides with the point; 2) $AA_1$, $AA_2$ are the two asymptotic tangents. Such a frame depends on the coordinates of $A$ and a set of "secondary parameters" which determine the frame within the sub-family attached to the same point $A$. The totality of these frames attached to different points of the space satisfies a system of equations of the form


* The author wishes to express his thanks to Professor Shiing-shen Chern for his valuable 
suggestions. 


1 E. Cartan, Sur la déformation projective des surfaces, Annales Ec. Norm. Sup., (3), 37 (1920), pp. 259-356.

2 E. Bompiani, Sulle varietà anolonome, 1, 2, Rend. R. Accad. Lincei, serie VI, 27 (1938), pp. 37-52.

3 E. Bompiani, ibid.

4 E. Cartan, ibid.



NON-HOLONOMIC SURFACE 563 


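$$
(2)\qquad
\begin{cases}
dA = \omega_0^0 A + \omega_0^1 A_1 + \omega_0^2 A_2 + \omega_0^3 A_3,\\
dA_1 = \omega_1^0 A + \omega_1^1 A_1 + \omega_1^2 A_2 + \omega_1^3 A_3,\\
dA_2 = \omega_2^0 A + \omega_2^1 A_1 + \omega_2^2 A_2 + \omega_2^3 A_3,\\
dA_3 = \omega_3^0 A + \omega_3^1 A_1 + \omega_3^2 A_2 + \omega_3^3 A_3,
\end{cases}
$$

with the relation

(3) $\omega_0^0 + \omega_1^1 + \omega_2^2 + \omega_3^3 = 0$.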
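
The relation (3) can be read off from the normalization of the frames. As a brief sketch, writing $\omega_i^j$ for the coefficient of $A_j$ in $dA_i$ (with $A_0 = A$; this indexing is an assumption made here for the illustration), one differentiates the determinant $|A A_1 A_2 A_3| = 1$ one column at a time:

```latex
\begin{aligned}
0 = d\,|A\,A_1\,A_2\,A_3|
  &= |dA\;A_1\,A_2\,A_3| + |A\;dA_1\;A_2\,A_3| + |A\,A_1\;dA_2\;A_3| + |A\,A_1\,A_2\;dA_3| \\
  &= \bigl(\omega_0^0 + \omega_1^1 + \omega_2^2 + \omega_3^3\bigr)\,|A\,A_1\,A_2\,A_3|
   = \omega_0^0 + \omega_1^1 + \omega_2^2 + \omega_3^3 .
\end{aligned}
```

Only the diagonal coefficient survives in each term, because every other contribution repeats one of the points and so gives a vanishing determinant.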
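For simplicity we shall write, in what follows, $\omega^i$ for $\omega_0^i$, $i = 1, 2, 3$. Since the lines $AA_1$, $AA_2$ remain fixed when $A$ is fixed, it follows that $\omega_1^3$, $\omega_2^3$, $\omega_1^2$, $\omega_2^1$ are linear combinations of $\omega^1$, $\omega^2$, $\omega^3$. But the asymptotic directions are given by

$$\omega^3 = 0, \qquad \omega^1 \omega_1^3 + \omega^2 \omega_2^3 = 0$$

so that $\omega_1^3$ is a linear combination of $\omega^2$, $\omega^3$ and $\omega_2^3$ a linear combination of $\omega^1$, $\omega^3$ only. We can therefore put

$$
(4)\qquad
\begin{cases}
\omega_1^3 = b\,\omega^2 + c\,\omega^3, \qquad \omega_2^3 = b^*\omega^1 + c^*\omega^3,\\
\omega_1^2 = \alpha\,\omega^1 + \beta\,\omega^2 + \gamma\,\omega^3, \qquad \omega_2^1 = \alpha^*\omega^1 + \beta^*\omega^2 + \gamma^*\omega^3.
\end{cases}
$$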


From this family of frames we shall choose a sub-family characterized by invariant conditions. Let us denote by $\delta$ the symbol of differentiation under which only the secondary parameters vary, the coordinates $x$, $y$, $z$ of the point $A$ remaining fixed, and by $d$ a symbol under which all the variables vary. For simplicity of notation we write

(5) $\omega_i^j(d) = \omega_i^j$, $\quad \omega_i^j(\delta) = e_i^j$, $\quad i, j = 0, 1, 2, 3$.
Then we have 
(6) 


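It is well-known that the Pfaffian forms in (2) satisfy the "equations of structure":

$$
(7)\qquad
\begin{cases}
(\omega^1)' = [(\omega_0^0 - \omega_1^1)\omega^1] + [\omega^2 \omega_2^1] + [\omega^3 \omega_3^1],\\
(\omega^2)' = [(\omega_0^0 - \omega_2^2)\omega^2] + [\omega^1 \omega_1^2] + [\omega^3 \omega_3^2],\\
(\omega^3)' = [(\omega_0^0 - \omega_3^3)\omega^3] + [\omega^1 \omega_1^3] + [\omega^2 \omega_2^3],\\
(\omega_1^3)' = [(\omega_1^1 - \omega_3^3)\omega_1^3] + [\omega_1^0 \omega^3] + [\omega_1^2 \omega_2^3],\\
(\omega_2^3)' = [(\omega_2^2 - \omega_3^3)\omega_2^3] + [\omega_2^0 \omega^3] + [\omega_2^1 \omega_1^3],\\
(\omega_1^2)' = [(\omega_1^1 - \omega_2^2)\omega_1^2] + [\omega_1^0 \omega^2] + [\omega_1^3 \omega_3^2],\\
(\omega_2^1)' = [(\omega_2^2 - \omega_1^1)\omega_2^1] + [\omega_2^0 \omega^1] + [\omega_2^3 \omega_3^1],
\end{cases}
$$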




564 HSIEN-CHUNG WANG 


from which it follows that 


= — ei)w' — egw’, 5a” = — — 
= (€2 — €3)w2 + ew, dwi = — + — 
| = (€2 — ei)w2 + e2w' — 


Equations (4) and (8) then give 
(sb = (ei tes — — e3)b, = e + + (et — 


Sa = (2ei — — ed)a, by = (ex — — 
(9) ¢ + esa + — cic, 
= — — ei)a*, = (e + — — e1)y* 


+ eja* + — ejct, 
| 68 = (ei — + er — e3b, = — + — 


We shall suppose that the tangent plane of the non-holonomic surface depends 
on three essential parameters. Then we have 


$b\,b^* \ne 0$.


Equations (9) show that we can choose the secondary parameters such that the 
following conditions are satisfied: 


b*=1, = =0, 


i.e. 
b 2 1 1 
wi = bw, = 
(10) b 
wr = aw + yw’, = + 
To keep these conditions unaltered, we must have 
(11) 
Equations (8) then become 
dw! = — dw” = — da” = 2eQw’, 
= Qeiwi = — 2elws 


The family of frames for which the conditions (10) are fulfilled is called the 
family of canonical frames. The canonical frames at a point A depend on three 




secondary parameters. By making use of relations (12), we can verify that the expression

$$
(13)\qquad d\sigma^2 = \frac{(\omega^2 \omega_1^3 - \omega^3 \omega_1^2)(\omega^1 \omega_2^3 - \omega^3 \omega_2^1)}{(\omega^3)^2}
$$

is the quotient of two differential forms in $x$, $y$, $z$ and is independent of the secondary parameters. It is therefore a projective invariant of the non-holonomic surface, which we shall call its projective linear element.


3. Geometrical interpretation of the projective linear element. To interpret the projective linear element geometrically, we employ a notation due to Fubini.⁵ Let $P_1$, $P_2$ be two analytic points. We denote by $[P_1 P_2]$ the analytic line with Plückerian coordinates equal respectively to the minors of the matrix $\| P_1 P_2 \|$. Furthermore, if $[P_1 P_2]$ and $[P_1' P_2']$ define two lines, then the scalar product $S[P_1 P_2][P_1' P_2']$ stands for the determinant $| P_1 P_2 P_1' P_2' |$. Thus two lines $[P_1 P_2]$, $[P_1' P_2']$ intersect if and only if

$$S[P_1 P_2][P_1' P_2'] = 0.$$
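
The intersection criterion is easy to try out on explicit points. The following is a small sketch, assuming points of projective 3-space are given by integer homogeneous coordinates; the function names and the sample lines are hypothetical and chosen only for the illustration.

```python
def det(rows):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [row[:j] + row[j + 1:] for row in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def lines_intersect(p1, p2, q1, q2):
    """Lines [p1 p2] and [q1 q2] meet exactly when |p1 p2 q1 q2| = 0."""
    return det([p1, p2, q1, q2]) == 0

# Line through two base points of the coordinate tetrahedron.
p1, p2 = [1, 0, 0, 0], [0, 1, 0, 0]
# A second line passing through (1, 1, 0, 0), which lies on the first line.
q1, q2 = [1, 1, 1, 0], [1, 1, 0, 0]
# The opposite edge of the tetrahedron, skew to the first line.
r1, r2 = [0, 0, 1, 0], [0, 0, 0, 1]

meets = lines_intersect(p1, p2, q1, q2)
skew = not lines_intersect(p1, p2, r1, r2)
```

The determinant vanishes when the four points are linearly dependent, which is the same as the two lines lying in a common plane and hence meeting.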


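For a canonical frame $A A_1 A_2 A_3$ the lines $AA_1$ and $AA_2$ are the asymptotic tangents at $A$. Let us put

$$l_1 = [A A_1], \qquad l_2 = [A A_2].$$

The asymptotic tangents $l_1'$, $l_2'$ at a neighboring point $A'$ of $A$ are given by the Taylor's expansions

$$l_\alpha' = [A A_\alpha] + [A\, dA_\alpha] + [dA\, A_\alpha] + \tfrac{1}{2}\{[A\, d^2 A_\alpha] + 2[dA\, dA_\alpha] + [d^2 A\, A_\alpha]\} + \cdots, \qquad \alpha = 1, 2.$$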

It follows that 

(Shli = (w'wi — w'wi) + (3), 

| Sle = — + (3), 

= + — — + w'(wo + 03) + w'wl + + (3), 

Sali = w+ (ww — wi) + + + ws) + + wwe} + (3), 


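where (3) denotes a sum of terms of at least the third order. A line $\lambda_1 l_1 + \lambda_2 l_2$ through $A$ on $\pi$ intersects $l_1'$ if and only if

$$S(\lambda_1 l_1 + \lambda_2 l_2)\, l_1' = 0,$$

i.e.

$$\frac{\lambda_1}{\lambda_2} = -\frac{S\, l_2 l_1'}{S\, l_1 l_1'}.$$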


5 G. Fubini et E. Čech, Géométrie projective différentielle des surfaces, Paris, 1931, pp. 4-6.




Similarly, a line $\rho_1 l_1 + \rho_2 l_2$ through $A$ on $\pi$ intersects $l_2'$ if and only if

$$\frac{\rho_1}{\rho_2} = -\frac{S\, l_2 l_2'}{S\, l_1 l_2'}.$$

The four lines $l_1$, $l_2$, $l_1'$, $l_2'$ determine four points on the line of intersection of the tangent planes at $A$ and $A'$. The cross ratio $H$ of these four points is found to be


pr (w*)? 
Hence the projective linear element can be interpreted as follows: 

Let $l_1$ and $l_2$ be the asymptotic tangents at $A$, and $l_1'$, $l_2'$ be those at a neighboring point $A'$ of $A$ not on $\pi$. Then the four lines $l_1$, $l_2$, $l_1'$, $l_2'$ intersect the line of intersection of the tangent planes $\pi$, $\pi'$ at $A$, $A'$ in four points whose cross ratio is equal to $d\sigma^2$.


4. The main theorem and two lemmas. Two non-holonomic surfaces which 
are projectively equivalent have the same projective linear element. It is 
easily verified that this remains true if they are related by a correlation. Our 
main theorem asserts that the converse also holds. We shall formulate it as 
follows: 

THEOREM. Suppose $S$ and $\bar S$ be two non-holonomic surfaces whose tangent planes depend on three essential parameters. If there is a correspondence between $S$ and $\bar S$ such that their projective linear elements are equal, then there exists a collineation or a correlation carrying $S$ to $\bar S$.

Before proving the theorem, we shall establish two lemmas.

LEMMA 1. Let $S$ and $\bar S$ be two non-holonomic surfaces having the same projective linear element and $R$ be a family of canonical frames $A A_1 A_2 A_3$ of $S$ such that at each point there is one and only one frame of $R$. Then there exists a family $\bar R$ of canonical frames $\bar A \bar A_1 \bar A_2 \bar A_3$ of $\bar S$ having the same property and satisfying either the conditions


wo =o, =o, =a, = = w2, 
1 
2 


(14) 
@1 = 


or the conditions 


(14’) 2 -2 1 -1 0 3 =s«-:0 -3 
= —O1, = Wo W3 = Wo — 3, 


where $\omega_i^j$ and $\bar\omega_i^j$ ($i, j = 0, 1, 2, 3$) are the Pfaffian forms corresponding to the two families $R$ and $\bar R$ of frames respectively.

PROOF: We first take $\bar R$ to be the family of all the canonical frames attached to different points of the space, and let $\bar\omega_i^j$ ($i, j = 0, 1, 2, 3$) be the Pfaffian forms


in (2) satisfied by frames of this family. From the identity of the projective 
linear elements 


(w’ wi — w'wi)(w'w, — (@ a1 a1) (a — 


(15) (wi)? 


we have

$$\omega^3 = \rho\, \bar\omega^3.$$

The third equation of (12) shows that we can choose a sub-family of $\bar R$ such that

(16) $\omega^3 = \bar\omega^3$.

Making use of this equation, we have, from (15) 

(17) — — wei) = Wal — — 
from which it follows 

(18) (ww)? = (a'a’) 


so that $\bar\omega^1$, $\bar\omega^2$ are multiples of $\omega^1$ or $\omega^2$. On the other hand, forming the bilinear covariant of (16), we get


— — ab + + (6 -(6 = 0 


where 6 is defined by 


@ = ba. 


The above equation holds, only when 


(19) wo — w3 — — G3) = rw, ( w] — (5 = 0. 


From (12) and the relation 


we see that we can make 
(20) w = a, &. 
When these conditions hold, the family $\bar R$ of frames is such that to each point in space there is attached one and only one frame of $\bar R$. In other words, there is a one-to-one correspondence between the frames of $R$ and $\bar R$.

From equations (18) and (20) we get 

w = +e. 

According as the upper or the lower sign holds, we shall divide the discussion 
into two cases: 




Case 1. w = @. Then we have 


$$(b - \bar b)(b \bar b + 1) = 0.$$

If $b = \bar b$, equations (14) follow as a consequence of (17) and the equations already obtained. If $b \bar b + 1 = 0$, we change the family of frames $R$ to the family $R^*$ of frames $A^* A_1^* A_2^* A_3^*$, related to $A A_1 A_2 A_3$ as follows:


A* =A, Gata, Az =bA:, = —As. 


b 
The two families $R^*$ and $\bar R$ then satisfy the relations (14').
Casz 2. w = —&. We have then (b + 6)(bb — 1) = 0. For the case 


b + b = 0, equation (17), together with the change of frame 
A*=A, AL=A,, At AS =As 


gives (14), while for the case bb — 1 = 0, equation (17), together with the change 
of frame 


gives (14’). Thus the lemma is proved. 

LEMMA 2. Suppose there be two families of canonical frames $A A_1 A_2 A_3$, $\bar A \bar A_1 \bar A_2 \bar A_3$ of two non-holonomic surfaces $S$ and $\bar S$ respectively. If the relations (14) hold, then we have


= W2, = W3, @ = Ws. 


PROOF. Since the relations (14) hold, we take their bilinear covariants and obtain


f [(wo = wi + = 0, 
— of — + + 163 — = 


b[(wi — w3 — + + — = 0, 


— w3 — + + — = 0, 
(22) 


+ [(w2 + — — + 2[(w; — a3)w'] = 0, 
[(ot — — + + + — — + = 0, 


— wt — & + + y*w°)] + [(o2 ae +} @3)w'] = 0 
\ 




where $b$, $\alpha$, $\alpha^*$, $\gamma$, $\gamma^*$ are given by (10). From (22) we can put


(we — wi — Go + = Te’, — ws = Tw! + we’, 

wo — we — + = rw’, — ws = re + ute 

—w;=} + (u*w' + uw’) + 


Taking the bilinear covariant of the first two equations of (23), we obtain
— + — bas — w2 + + — 
= [{dr + r(wo — w3)}o"] + [w'w"], 


(24) 


— + [(@ — wi + + [(a3 — w3)w"] 


| = [fdr* + — + r* (6 


A comparison of the coefficients of [w'w'] on both sides of (24) gives 


(25) r= (4-1), r= ist 


from which we get 

(26) (b* — 40° + 1)r = 0. 

Suppose that b‘ — 4b° + 1 ¥ 0. It follows from (26) and (24) that 
= 0. 


The relations (23) then become 


0 -0 -0 43 
wo — = @ — w3 = = a1, @3 — w3 = 
wo — we = — a2, w3 = we = we 


from which, by forming the bilinear covariants, we get v = 0. If this is the 
case, the lemma is established. If b* — 4b° + 1 = 0, while a and a* do not — 
both vanish, we still have, on account of (25) and the last two equations of (22), 


r= = 0. 


From these equations, the lemma can be established as above. 
It remains for us to consider the case 


(27) a=at=)'— 4b 4+1=0. 


Let 

(28) dr = rw + rw + rw’, dr* = + rw 
w) — = aw + aw + aw’. 


Then equations (24) give 


1 
gut (I 3b) = re, gu (> 3) + rte, 
(29) 
from which we have, by making use of (24), 
u=u* = 0. 
Then equations (23) take the following form: 
(wy — wi — + = Te’, — ws = Tw, — = dra’, 
(30) rw, os — r*w’, os — ws = rw, 
— 


Now the Pfaffian forms w3 are not arbitrary. In fact, by taking the bilinear 
covariants of the first two equations of (10), we get 


+ + + w2 — wo — + + = 0, 


(6 + + + ws — wo — wslw'] + («! + ; = 0. 


Thus we can write 


1 
wi + bws = + H+ + (++ 
1 1 
Two cases must now be considered separately: the case $\gamma = \gamma^* = 0$ and the case in which $\gamma$ and $\gamma^*$ do not both vanish. In the former case, we get $r = r^* = v = 0$, the derivation of which involves a long calculation, whose details we omit here. In the latter case, we assume, for definiteness, $\gamma \ne 0$. By forming the bilinear covariant of the identity


— wi) — br(as — ws) = 0 


we obtain 


( + + +t + + = 0. 




This equation, together with equations (25) and (27), also gives 
r=r*=y=0. 


Our proof is therefore complete. 

PROOF OF THE MAIN THEOREM. Let $S$ and $\bar S$ be two non-holonomic surfaces having the same projective linear element. From lemma 1, there are two families of canonical frames of $S$ and $\bar S$ respectively such that either the conditions (14) or the conditions (14') hold. In the former case, lemma 2 shows that $S$ can be transformed to $\bar S$ by a collineation. In the latter case, we introduce plane coordinates by putting


a= —[AA,Ap)], = [A A,Asl, aa = a3 = [A;AeA3]. 


Then a, a; , , a3 satisfy the equations’ 


3 3 3 3 
(da = —w3a — — wide — was, 
2 2 2 2 
1 1 1 1 
daz = — we, — wide — w Q;3, 
0 0 0 0 
das = —w3d — — wide — w Az 
and lemma 2 gives 
—W3 = 00, —@3 = 1, —@3 = = —@2 = @1 
—We = —@1 = @2, —@1 = W3, = 3, 


which proves that the non-holonomic surfaces $S$ and $\bar S$ are related by a correlation. Hence our main theorem is proved.


TSING HUA UNIVERSITY,
KUNMING, CHINA.


6 G. Fubini et E. Čech, ibid. p. 220.



